
Concept Based Neighbor Cluster Ensemble Re-Clustering Method

S. Sathiya, V.P. Dhivya

Abstract


A clustering ensemble combines several partitions, generated by different clustering algorithms, into a single clustering solution. We propose an optimization-based method for combining cluster ensembles for the class of problems with intra-cluster criteria, such as Minimum-Sum-of-Squares Clustering (MSSC). To solve the MSSC problem we use a simple and efficient algorithm, the improved Exact Method for cluster ensemble re-clustering, which relies on similarity measures and on the distances between weak clusters. A single clustering algorithm run on its own often does not yield a good solution, whereas the solution obtained by the proposed algorithm is guaranteed to be better than those of the individual clusterings in the ensemble. For the MSSC problem in particular, a prototype implementation of the improved Exact Method produces a new, better solution. The algorithm is particularly effective when the number of clusters is large, in which case it is able to escape the local minima found by K-means-type algorithms by recombining their solutions in a set-covering context. The stability of the algorithm is also established: running it several times on the same clustering problem instance consistently produces high-quality solutions. Finally, the experiments use external criteria to assess the validity of the clusterings. The algorithm produces high-quality results that are comparable to those of the best known clustering algorithms.
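The recombination idea in the abstract can be illustrated with a minimal sketch. It is not the authors' improved Exact Method: it assumes a tiny hand-made data set, pools the clusters of two base partitions, and uses brute-force enumeration in place of an exact integer-programming solver. It also requires the chosen clusters to be disjoint (a set-partitioning constraint), which is a simplifying assumption. The names `cluster_cost`, `mssc`, and `recombine` are hypothetical.

```python
from itertools import combinations


def cluster_cost(points, members):
    """Sum of squared Euclidean distances from each member to the cluster centroid."""
    members = list(members)
    dim = len(points[0])
    centroid = [sum(points[i][d] for i in members) / len(members) for d in range(dim)]
    return sum((points[i][d] - centroid[d]) ** 2 for i in members for d in range(dim))


def mssc(points, partition):
    """MSSC objective of a partition: total within-cluster sum of squares."""
    return sum(cluster_cost(points, cluster) for cluster in partition)


def recombine(points, partitions, k):
    """Pick k pooled clusters that are disjoint, cover every point, and minimise MSSC.

    Brute force over all k-subsets of the pool; only feasible for toy inputs.
    """
    pool = sorted({frozenset(c) for p in partitions for c in p}, key=sorted)
    n = len(points)
    best, best_cost = None, float("inf")
    for combo in combinations(pool, k):
        # Disjoint and covering: sizes add up to n and the union has n points.
        if sum(len(c) for c in combo) != n:
            continue
        if len(frozenset().union(*combo)) != n:
            continue
        cost = sum(cluster_cost(points, c) for c in combo)
        if cost < best_cost:
            best, best_cost = combo, cost
    return best, best_cost


# Toy data (assumed): four tight pairs of points, two on the left, two on the right.
points = [(0, 0), (0, 1), (5, 0), (5, 1), (20, 0), (20, 1), (25, 0), (25, 1)]

# Two base partitions, each stuck in a K-means-style local minimum on one side.
part_a = [{0, 1}, {2, 3}, {4, 6}, {5, 7}]  # left side good, right side bad
part_b = [{0, 2}, {1, 3}, {4, 5}, {6, 7}]  # left side bad, right side good

best, best_cost = recombine(points, [part_a, part_b], k=4)
print("MSSC of base A:", mssc(points, part_a))
print("MSSC of base B:", mssc(points, part_b))
print("MSSC after recombination:", best_cost)
print("Chosen clusters:", [sorted(c) for c in best])
```

Because every base partition is itself a feasible selection from the pool, the recombined solution in this sketch can never be worse than the best base partition. On the toy data each base partition has an MSSC of 26.0, while the recombined partition takes the good half of each and reaches 2.0.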

Keywords


Clustering Ensemble, MSSC, K-Means Algorithm, Set Covering Context.



Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.