Open Access

Clustering on High Dimensional Data using Locally Linear Embedding (LLE) Techniques

T. Shalini, V. Suganya

Abstract


Clustering is the task of grouping a set of objects so that objects in the same group (called a cluster) are more similar, in some sense, to each other than to those in other groups (clusters). The dimensionality of such data can be reduced using dimension-reduction techniques. Recently, a nonlinear method called Locally Linear Embedding (LLE) was introduced for reducing the dimensionality of high-dimensional data. We combine LLE with K-means clustering in a coherent framework to adaptively select the most discriminant subspace: K-means clustering is used to generate class labels, and LLE is used to perform subspace selection.
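The pipeline the abstract describes — nonlinear dimension reduction with LLE followed by K-means clustering in the embedded subspace — can be sketched with scikit-learn. This is a minimal illustration, not the paper's implementation; the dataset and the `n_neighbors`, `n_components`, and `n_clusters` values below are illustrative assumptions.

```python
# Sketch of the LLE + K-means pipeline (illustrative parameters, not from the paper).
from sklearn.datasets import load_digits
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.cluster import KMeans

# High-dimensional data: 1797 handwritten-digit images, 64 features each.
X, y = load_digits(return_X_y=True)

# Step 1: LLE maps the data to a low-dimensional subspace while preserving
# the locally linear relationships among each point's nearest neighbors.
lle = LocallyLinearEmbedding(n_neighbors=12, n_components=2, random_state=0)
X_low = lle.fit_transform(X)

# Step 2: K-means clusters in the embedded subspace, producing class labels.
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_low)

print(X_low.shape)  # (1797, 2)
```

In the adaptive framework sketched by the abstract, these two steps would alternate: the K-means labels guide the choice of a more discriminant subspace, and clustering is repeated in that subspace.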


Keywords


Clustering, High Dimension Data, Locally Linear Embedding, K-Means Clustering, Principal Component Analysis


References


Alon, U., et al. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. PNAS, 96(12): 6745–6750, 1999.

Schwaighofer, A. (2004). Matlab interface to SVMlight.

Banerjee, A., Dhillon, I., Ghosh, J., Merugu, S., & Modha, D. (2004). A generalized maximum entropy approach to Bregman co-clustering and matrix approximation. Proc. ACM Int'l Conf. Knowledge Discovery and Data Mining (KDD).

Barbara, D., "An Introduction to Cluster Analysis for Data Mining".

Broomhead, D.S., Kirby, M., A new approach to dimensionality reduction: theory and algorithms, SIAM Journal of Applied Mathematics 60(6) (2000) 2114–2142.

Carreira-Perpiñán, M.A., A review of dimension reduction techniques. Technical Report CS-96-09, Department of Computer Science, University of Sheffield, 1997.

Cheng, Y., & Church, G. (2000). Biclustering of expression data. Proc. Int’l Symp. Mol. Bio (ISMB), 93–103.

Diamantaras, K.I. and Kung, S.Y. Principal Component Neural Networks: Theory and Applications. John Wiley & Sons, New York, London, Sydney, 1996.

Ding, C., & He, X. (2004). K-means clustering via principal component analysis. Proc. Int'l Conf. Machine Learning (ICML).

Ding, C., He, X., Zha, H., & Simon, H. (2002). Adaptive dimension reduction for clustering high dimensional data. Proc. IEEE Int'l Conf. Data Mining.

De la Torre, F., & Kanade, T. (2006). Discriminative cluster analysis. Proc. Int’l Conf. Machine Learning.

DeMers, D. and Cottrell, G.W. Non-linear dimensionality reduction. In C.L. Giles, S.J. Hanson, and J.D. Cowan, editors, Advances in Neural Information Processing Systems 5, pages 580–587. Morgan Kaufmann, San Mateo, CA, 1993.

Fukunaga, K., Olsen, D.R., An algorithm for finding intrinsic dimensionality of data, IEEE Transactions on Computers 20(2) (1976) 165–171.

Halkidi, M., Batistakis, Y., "On Clustering Validation Techniques", 2001.

Han, J. and Kamber, M. "Data Mining: Concepts and Techniques", Morgan Kaufmann Publishers, August 2000. ISBN 1-55860-489-8.

Hartigan, J.A., Wong, M.A., "A K-Means Clustering Algorithm", Applied Statistics, Vol. 28, 1979, pp. 100–108.

Hartuv, E., Schmitt, A., Lange, J., "An Algorithm for Clustering cDNAs for Gene Expression Analysis", Proc. Third Int'l Conf. Computational Molecular Biology (RECOMB), 1999.

Hotelling, H. "Analysis of a complex of statistical variables into principal components". Journal of Educational Psychology, 24:417–441, 1933.

Jackson, J.E. "A User's Guide to Principal Components". John Wiley and Sons, New York, 1991.

Jain, A.K., Dubes, R.C., Algorithms for Clustering Data, Prentice Hall, 1988.

Jain, A.K., Murty, M.N., Flynn, P.J. "Data Clustering: A Review", ACM Computing Surveys, Vol. 31, No. 3, 1999.

Johnstone, I.M. (2009). "Non-Linear dimensionality reduction by LLE".

Jolliffe, I. (2002). "Principal Component Analysis". Springer, 2nd edition.

Jolliffe. I.T. ―Principal Component Analysis‖. Springer-Verlag, 1986.

Kambhatla, N. and Leen, T.K. "Dimension reduction by local principal component analysis". Neural Computation 9, 1493–1516 (1997).

Kaski, S. "Dimensionality reduction by random mapping: fast similarity computation for clustering". Proc. IEEE International Joint Conference on Neural Networks, 1:413–418, 1998.

Kaufman, L., Rousseeuw, P.J., "Finding Groups in Data: An Introduction to Cluster Analysis". John Wiley, New York, 1990.

Kirby, M., "Geometric Data Analysis: An Empirical Approach to Dimensionality Reduction and the Study of Patterns", John Wiley and Sons, 2001.

Kohavi, R. and John, G. The wrapper approach. In H. Liu and H. Motoda, editors, Feature Extraction, Construction and Selection: A Data Mining Perspective. Springer Verlag, 1998.

Kouropteva, O., Okun, O., and Pietikäinen, M. Selection of the optimal parameter value for the locally linear embedding algorithm. Submitted to 1st International Conference on Fuzzy Systems and Knowledge Discovery, 2002.

Kramer, M. Nonlinear principal component analysis using autoassociative neural networks. AIChE Journal 37, 233–243 (1991).

Lee, Y. and Lee, C.K., "Classification of multiple cancer types by multicategory support vector machines using gene expression data", Bioinformatics, vol. 19, pp. 1132–1139, 2003.

Li, K.-C. High dimensional data analysis via the SIR/PHD approach. Lecture notes in progress, April 2000.

Li, T., Ma, S., & Ogihara, M. (2004). Document clustering via adaptive subspace iteration. Proc. ACM Conf. Research and Development in Information Retrieval (SIGIR) (pp. 218–225).

Wolfe, P.J. and Belabbas, A. (2008). "Hessian eigenmaps: locally linear techniques for high-dimensional data".

Luan, Y., Li, H., "Clustering of time-course gene expression data using a mixed-effects model with B-splines", Bioinformatics, Vol. 19, 2003, pp. 474–482.

Roweis, S.T. and Saul, L.K. Nonlinear dimensionality reduction by locally linear embedding. Science 290, 2323–2326 (2000).

Zha, H., Ding, C., Gu, M., He, X., & Simon, H. (2002). Spectral relaxation for K-means clustering. Advances in Neural Information Processing Systems 14 (NIPS’01), 1057–1064.




Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.