
Efficient Keyword Based Document Clustering Using Fuzzy C – Means Algorithm

K. Prabha, K. Vivekanandan, Dr.S. Sukumaran

Abstract


Clustering is a useful technique in textual data mining. Cluster analysis divides objects into meaningful groups based on the similarity between them. Existing clustering approaches face issues such as limited practical applicability, low accuracy, and long classification time. In recent work, incorporating fuzzy logic into clustering has produced better results. To further improve clustering performance, the Fuzzy C-Means Algorithm (FCMA) is used. Keywords are extracted from the documents using LSA-based extraction. A fuzzy partition matrix is created for the clustering process, and keyword-based document clustering with this approach performs better than the existing K-Means clustering and EM algorithms. The proposed technique is expected to be useful in text mining for increasing the accuracy and performance of text extraction.
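As an illustration of the fuzzy partition matrix mentioned in the abstract, the following is a minimal sketch of the standard fuzzy c-means iteration in Python with NumPy. It is not the authors' implementation. It assumes the documents have already been reduced to a numeric keyword-weight matrix `X` (one row per document), and the function name `fuzzy_c_means`, the fuzzifier `m = 2.0`, the tolerance, and the seed are all hypothetical choices made for this example. The LSA-based keyword extraction step is not shown.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy c-means (illustrative sketch, not the paper's code).

    X : (n_documents, n_keywords) array of keyword weights (assumed input).
    c : number of clusters.
    m : fuzzifier, must be > 1 (2.0 is a common default, assumed here).
    Returns (centers, U), where U[i, j] is the degree to which
    document i belongs to cluster j and every row of U sums to 1.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Random initial fuzzy partition matrix, normalised so rows sum to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(max_iter):
        # Cluster centers: membership-weighted means of the documents.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]

        # Euclidean distance from every document to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero

        # Membership update: u_ij proportional to d_ij^(-2 / (m - 1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        # Stop once the partition matrix no longer changes noticeably.
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return centers, U
```

Unlike K-Means, which assigns each document to exactly one cluster, each row of `U` spreads a document's membership across all clusters; taking `U.argmax(axis=1)` recovers a hard assignment when one is needed.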

Keywords


Document Clustering, Fuzzy Cluster, Fuzzy C-Means, K-Means Clustering





Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.