Efficient K-Means Algorithm for Data Clustering Using Calinski Indexing

K. Sheela, N. Kamalraj

Abstract


Clustering is one of the important tools in data mining. It is used in several research areas as a method for grouping data. This paper describes the K-Means clustering algorithm and uses a validity index (the Calinski index) for attribute selection, with the value of the validity index serving as the fitness function. The Calinski index is also used to find the best number of clusters for the whole data set: the method looks for the value of k that maximizes the index, where k is the number of clusters and the index is evaluated for each k-cluster partition. The number of clusters is calculated from the Calinski index values along with NMF. The Rand index value and the accuracy obtained with the Calinski-based selection are reported to be better than those of existing clustering methods.
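The selection of k described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard Calinski-Harabasz definition, CH(k) = [B / (k - 1)] / [W / (n - k)], where B is the between-cluster and W the within-cluster sum of squares, and it uses a plain K-Means with a deterministic farthest-first initialization chosen only to keep the example reproducible. The attribute selection and NMF steps are not shown, and all function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def farthest_first(points, k):
    """Assumed initialization: start at the first point, then repeatedly
    add the point farthest from the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Plain K-Means (Lloyd's iterations); returns one label per point."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = mean(members)
    return labels


def calinski_harabasz(points, labels):
    """Calinski-Harabasz index: (B / (k - 1)) / (W / (n - k))."""
    n = len(points)
    clusters = sorted(set(labels))
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("the index needs 2 <= k < n non-empty clusters")
    overall = mean(points)
    between = within = 0.0
    for j in clusters:
        members = [p for p, lab in zip(points, labels) if lab == j]
        center = mean(members)
        between += len(members) * sq_dist(center, overall)
        within += sum(sq_dist(p, center) for p in members)
    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


def best_k(points, k_values):
    """Return the k with the largest index value, plus all scores."""
    scores = {k: calinski_harabasz(points, kmeans(points, k)) for k in k_values}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Toy data (not the WBC dataset): three well-separated groups of four points.
    demo = [(x + dx, y + dy)
            for x, y in [(0, 0), (10, 10), (20, 0)]
            for dx in (0, 1) for dy in (0, 1)]
    k, scores = best_k(demo, range(2, 6))
    print("selected k:", k)
```

On real data, K-Means is usually restarted from several initializations, and the index is evaluated on the best partition found for each k.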

Keywords


K-Means Clustering, Calinski Index Value, Rand Index Value, WBC Dataset.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.