Open Access Open Access  Restricted Access Subscription or Fee Access

Affinity Propagation Based Algorithm for Optimal K-means Clustering

S. Senthamarai Kannan, Dr. N. Ramaraj, Dr. S. Baskar

Abstract


K-means clustering is widely used due to its fast convergence, but it is sensitive to the initial condition. The limitation of k-means algorithm is that the user has to specify the number of clusters (K). There are some methods to initialize the number of clusters. But those methods perform worse in some cases. So we are proposing a method called affinity propagation in this paper which resolves those problems. By making use of the convergence property of K-means and the good performance of affinity propagation, we presented a new clustering strategy which can produce much lower squared error than AP and standard k-means. The efficiency and effectiveness of our method is demonstrated through extensive comparisons with other methods using UCI datasets of high dimensionality.


Keywords


K-means, affinity Propagation, Centroid initialization, Clustering optimization

Full Text:

PDF

References


Kaufman, L., Rousseeuw, P. J.: Finding Groups in Data: An Introduction to Cluster Analysis. New York: John Wiley & Sons. D (1990)

Frey, B. J., Dueck, D.: Clustering by passing messages between data points. Science Magazine, Vol. 315, pp. 972--976 (2007)

MacQueen, J. B.: Some Methods for classification and Analysis of Multivariate Observation. Proceedings of 5-th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California Press, 1,281--297 (1967)

Selim, S. Z., Ismail, M. A.: K-means-type algorithms: A generalized convergence theorem and characterization of local optimality. IEEE Trans. Pattern Anal. Mach. Intell. 6, 81--87 (1984)

Fisher, D.: Iterative optimization and simplification of hierarchical clusterings. J. Artificial Intell. Res. 4, 147--179 (1996).

Arthur, D., Vassilvitskii, S.: K-means++: the advantages of careful seeding. Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, 1027 --1035 (2007)

Bradley, P. S., Fayyad, U. M.: Refining initial points for K-means clustering. Proceedings of the 15th Internet Conference on Machine Learning, San Francisco: Morgan Kaufmann Publishers, 91--99 (1998)

Sanjoy Dasgupta. How fast is -means? In Bernhard Sch¨olkopf and Manfred K. Warmuth, editors, COLT, volume 2777 of Lecture Notes in Computer Science,page 735. Springer, 2003.

Sariel Har-Peled and Soham Mazumdar. On corsets for k-means and k-median clustering. In STOC ’04: Proceedings of the thirty-sixth annual ACM symposium on Theory of computing, pages 291–300, New York, NY, USA, 2004. ACM Press.

Mary Inaba, Naoki Katoh, and Hiroshi Imai. Applications of weighted voronoi diagrams and randomization to variance-based k-clustering: (extended abstract). In SCG ’94: Proceedings of the tenth annual symposium on Computational geometry, pages 332–339, New York,NY, USA, 1994. ACM Press.

Tapas Kanungo, David M. Mount, Nathan S. Netanyahu, Christine D. Piatko, Ruth Silverman, and Angela Y. Wu. A local search approximation algorithm for k-means clustering. Comput. Geom., 28(2-3):89–112, 2004.

Ramgopal R. Mettu and C. Greg Plaxton. Optimal time bounds for approximate clustering. In Adnan Darwiche and Nir Friedman, editors, UAI, pages 344–351. Morgan Kaufmann, 2002.

Stuart P. Lloyd. Least squares quantization in pcm. IEEE Transactions on Information Theory, 28(2):129–136, 1982.

David Arthur and Sergei Vassilvitskii. How slow is the k-means method? In SCG ’06: Proceedings of the twenty-second annual symposium on computational geometry. ACM Press, 2006.

Sariel Har-Peled and Bardia Sadri. How fast is the k-means method? In SODA ’05: Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms, pages 877–885, Philadelphia, PA, USA, 2005. Society for Industrial and Applied Mathematics.

S. Senthamarai Kannan and N. Ramaraj: A Novel Hybrid Feature Selection via Correlation based Memetic Search Algorithm: IIMST journal,volume 5: 2009.


Refbacks

  • There are currently no refbacks.