Clustering Algorithms using Different Distance

Elaiyaperumal Sakthivel; Kaliaperumal Senthamarai Kannan


Abstract

Data mining is the process of discovering meaningful correlations, trends, and interesting patterns in a large volume of data. Clustering is the process of grouping similar data elements together. In this paper, the k-means and k-medoids algorithms are used with the Euclidean, Manhattan, and Squared distance measures on a real-time medical data set to group similar patients based on their vision ailments. The results are compared numerically and graphically to find the best distance measure. Experimental results show that the k-medoids clustering algorithm outperforms k-means clustering. The experiment was repeated using the different distance measures (Euclidean, Manhattan, and Squared). The results show that k-medoids with the Euclidean distance measure forms the densest clusters and is thus more effective than the other distance measures.
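The three distance measures named in the abstract, and the way they plug into a k-means or k-medoids style assignment step, can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes that "Squared" means squared Euclidean distance, and the function names (`euclidean`, `manhattan`, `squared_euclidean`, `assign`, `medoid`) are hypothetical names chosen for this example.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def squared_euclidean(p, q):
    """Euclidean distance without the square root (assumed meaning of 'Squared')."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centers, dist):
    """For each point, return the index of its nearest center under dist."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points]

def medoid(cluster, dist):
    """Return the cluster member with the smallest total distance to all members."""
    return min(cluster, key=lambda c: sum(dist(c, p) for p in cluster))
```

The assignment step is shared by both algorithms. They differ in the update step: k-means replaces each center with the mean of its cluster, while k-medoids picks an actual data point, as `medoid` does here. Swapping the `dist` argument is enough to repeat an experiment under a different distance measure.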

Keywords

Data Mining, Clustering, K-Means, K-Medoids, Distance Measure


This work is licensed under a Creative Commons Attribution 3.0 License.
