Survey on Clustering Algorithms for Text Mining

Dawlat A. Sayed, Sohair R. Fahmy

Abstract


Clustering is the process of grouping similar data objects together based on similarity criteria (i.e., based on shared properties). Document clustering is typically treated as a centralized process. It can be applied in two ways: online or offline. Of the two, online applications are generally more limited than offline applications because of availability issues. Document clustering supports a variety of tasks, such as grouping documents by domain, analyzing customer feedback, and finding meaningful hidden topics across a document collection. The data used for clustering is normalized beforehand. In terms of efficiency and accuracy, K-means produces better results than the other algorithms considered.
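As a rough illustration of the pipeline the abstract describes (normalize the data, then cluster with K-means), the following minimal sketch may help. It is not the authors' implementation. The term-frequency representation, the choice of L2 normalization, the four toy documents, and the hand-picked initial centroids are all assumptions made for this example, and the helper names (`tf_vector`, `l2_normalize`, `kmeans`) are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text, vocab):
    """Raw term-frequency vector of `text` over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocab]


def l2_normalize(vec):
    """Scale a vector to unit length (assumed normalization step)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, init_centroids, max_iter=100):
    """Plain Lloyd-style K-means with caller-supplied initial centroids."""
    centroids = [list(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy customer-feedback documents (illustrative only).
docs = [
    "refund late delivery refund",
    "delivery late package",
    "great screen battery",
    "battery screen great great",
]
vocab = sorted({word for d in docs for word in d.lower().split()})
points = [l2_normalize(tf_vector(d, vocab)) for d in docs]

# Initial centroids are fixed here so the run is deterministic.
labels, centroids = kmeans(points, init_centroids=[points[0], points[2]])
print(labels)
```

With these inputs, the two delivery-related documents fall into one cluster and the two product-related documents into the other. In practice the initial centroids are usually chosen at random or with a seeding heuristic, and the result can depend on that choice.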

Keywords


Clustering, K-Means, Hierarchical, Expectation and Maximization, Density Based Algorithm, Normalization.


Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.