Outlier Detection in High Dimensional Data Using kRKM Algorithm
Conventional clustering algorithms such as K-means and probabilistic clustering are sensitive to the presence of outliers in the data. Even a few outliers can bias these algorithms away from meaningful hidden structure, rendering their results unreliable. This motivates robust clustering algorithms that aim to cluster the data while also detecting the outliers. The proposed method relies on the infrequent occurrence of outliers in the data, which translates to sparsity in a judiciously chosen domain. Leveraging sparsity in the outlier domain, outlier-aware robust K-means and probabilistic clustering approaches are proposed. Their novelty lies in identifying outliers while enforcing sparsity in the outlier domain through carefully chosen regularization. A block coordinate descent method is developed to obtain iterative algorithms with convergence guarantees and small additional computational complexity relative to their non-robust counterparts. Kernelized versions of the robust clustering algorithms are also developed to efficiently handle high-dimensional data, to detect nonlinearly separable clusters, or even to cluster objects that are not represented by vectors. The experimental results are based on the polynomial kernel for outlier detection.
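To make the approach concrete, the following is a minimal sketch of the outlier-sparse robust K-means idea described above: each point is assigned an outlier vector, and a group-sparsity penalty on those vectors (weighted by a regularization parameter, here called `lam`) forces most of them to zero, so that nonzero vectors flag outliers. The function name, arguments, and initialization scheme are illustrative assumptions, not the paper's own code.

```python
import numpy as np

def robust_kmeans(X, k, lam, M0=None, n_iter=50, seed=0):
    """Sketch of outlier-sparse robust K-means via block coordinate descent.

    Model (assumed from the abstract's description):
        min over centroids M, assignments c, outliers O of
        sum_n ||x_n - m_{c(n)} - o_n||^2 + lam * sum_n ||o_n||_2
    The group-lasso term drives most o_n to zero; nonzero o_n flag outliers.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Centroid initialization: user-supplied, or random data points.
    M = (np.array(M0, dtype=float) if M0 is not None
         else X[rng.choice(n, size=k, replace=False)].astype(float))
    O = np.zeros((n, d))  # per-point outlier vectors (mostly zero)
    for _ in range(n_iter):
        # 1) Assign each "cleaned" point x_n - o_n to its nearest centroid.
        D = np.linalg.norm((X - O)[:, None, :] - M[None, :, :], axis=2)
        c = D.argmin(axis=1)
        # 2) Update each centroid as the mean of its cleaned points.
        for j in range(k):
            if np.any(c == j):
                M[j] = (X - O)[c == j].mean(axis=0)
        # 3) Group soft-thresholding: o_n = (1 - lam / (2 ||r_n||))_+ * r_n,
        #    the closed-form minimizer of ||r_n - o_n||^2 + lam * ||o_n||_2.
        R = X - M[c]
        norms = np.maximum(np.linalg.norm(R, axis=1, keepdims=True), 1e-12)
        O = np.maximum(0.0, 1.0 - lam / (2.0 * norms)) * R
    return M, c, O
```

Smaller `lam` values declare more points as outliers, so in practice `lam` is tuned (e.g. to a target outlier fraction). The kernelized variants mentioned above follow the same alternating scheme but replace Euclidean distances with kernel-induced distances (e.g. from a polynomial kernel), which is what allows nonlinearly separable clusters to be handled.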
P. A. Forero, V. Kekatos, and G. B. Giannakis, "Robust clustering using outlier-sparsity regularization," IEEE Trans. Signal Process., vol. 60, no. 8, pp. 4163–4177, Aug. 2012.
R. Xu and D. Wunsch II, "Survey of clustering algorithms," IEEE Trans. Neural Netw., vol. 16, no. 3, pp. 645–678, May 2005.
T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer Series in Statistics. New York: Springer, 2009.
S. Lloyd, "Least squares quantization in PCM," IEEE Trans. Inf. Theory, vol. 28, no. 2, pp. 129–137, Mar. 1982.
C. M. Bishop, Pattern Recognition and Machine Learning. New York: Springer, 2006.
A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. Roy. Stat. Soc., Series B (Methodol.), vol. 39, no. 1, pp. 1–38, 1977.
B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. Cambridge, MA: MIT Press, 2002.
B. Schölkopf, A. J. Smola, and K.-R. Müller, "Nonlinear component analysis as a kernel eigenvalue problem," Neural Comput., vol. 10, no. 5, pp. 1299–1319, Jul. 1998.
P. J. Huber and E. M. Ronchetti, Robust Statistics. New York: Wiley, 2009.
L. García-Escudero, A. Gordaliza, C. Matrán, and A. Mayo-Iscar, "A review of robust clustering methods," Adv. Data Anal. Classification, vol. 4, no. 2, pp. 89–109, 2010.
R. Krishnapuram and J. M. Keller, "A possibilistic approach to clustering," IEEE Trans. Fuzzy Syst., vol. 1, no. 2, pp. 98–110, May 1993.
N. R. Pal, K. Pal, J. M. Keller, and J. C. Bezdek, "A possibilistic fuzzy C-means clustering algorithm," IEEE Trans. Fuzzy Syst., vol. 13, no. 4, pp. 517–530, Aug. 2005.
R. N. Davé and R. Krishnapuram, "Robust clustering methods: A unified view," IEEE Trans. Fuzzy Syst., vol. 5, no. 2, pp. 270–293, May 1997.
K. Honda, A. Notsu, and H. Ichihashi, "Fuzzy PCA-guided robust k-means clustering," IEEE Trans. Fuzzy Syst., vol. 18, no. 1, pp. 67–79, Feb. 2010.
J. M. Jolion, P. Meer, and S. Bataouche, "Robust clustering with applications in computer vision," IEEE Trans. Pattern Anal. Mach. Intell., vol. 13, no. 8, pp. 791–802, Aug. 1991.
X. Zhuang, Y. Huang, K. Palaniappan, and Y. Zhao, "Gaussian mixture density modeling, decomposition, and applications," IEEE Trans. Image Process., vol. 5, no. 9, pp. 1293–1302, Sep. 1996.
S. Dasgupta and Y. Freund, "Random projection trees for vector quantization," IEEE Trans. Inf. Theory, vol. 55, no. 7, pp. 3229–3242, Jul. 2009.
M. Yuan and Y. Lin, "Model selection and estimation in regression with grouped variables," J. Roy. Stat. Soc., Series B, vol. 68, no. 1, pp. 49–67, Feb. 2006.
I. S. Dhillon, Y. Guan, and B. Kulis, "Kernel k-means: Spectral clustering and normalized cuts," in Proc. ACM Int. Conf. Knowl. Discovery Data Mining, Seattle, WA, 2004, pp. 551–556.