
Multilevel Based Hierarchical Clustering

E. Gothai, Dr. P. Balasubramie

Abstract


Clustering, an unsupervised learning process, is a challenging problem. Better clustering results improve the overall structure of the data. In this article, we propose an incremental stream of hierarchical clustering that improves the efficiency and accuracy and reduces the time consumption of a text categorization algorithm by forming exact sub-clusters. We propose a new method, called multilevel clustering, which is a combination of a supervised and an unsupervised technique for forming the clusters. The method forms four levels of clustering and builds on an existing clustering algorithm. We develop and discuss algorithms for the multilevel clustering method to achieve the best clustering results.

Keywords


Algorithms, Clustering, Experimentation of Levenshtein Distance Method, Supervised Learning, Unsupervised Learning, Cluster Formation, Similarity Measure, Learning, Edit Distance Learning, Data Mining.
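
The keywords name Levenshtein (edit) distance as a similarity measure, and the abstract describes four levels of clustering, but neither gives the algorithm itself. The following is only a minimal sketch of how the two ideas could fit together: it computes edit distance with the standard dynamic-programming recurrence and groups strings by single linkage at four progressively stricter thresholds. The function names (`levenshtein`, `cluster_at`, `multilevel`), the threshold values, and the single-linkage rule are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List, Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard dynamic-programming recurrence."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def cluster_at(words: Sequence[str], max_dist: int) -> List[List[str]]:
    """Assumed rule (single linkage): a word joins, and thereby merges,
    every cluster that has a member within max_dist edits of it."""
    clusters: List[List[str]] = []
    for w in words:
        merged = [w]
        rest = []
        for c in clusters:
            if any(levenshtein(w, m) <= max_dist for m in c):
                merged = c + merged
            else:
                rest.append(c)
        clusters = rest + [merged]
    return clusters


def multilevel(words: Sequence[str],
               thresholds: Sequence[int] = (3, 2, 1, 0)) -> Dict[int, List[List[str]]]:
    """Hypothetical four-level view: one partition per threshold,
    from coarse (largest threshold) to fine (smallest)."""
    return {t: cluster_at(words, t) for t in thresholds}


if __name__ == "__main__":
    sample = ["cluster", "clusters", "kitten", "mitten"]
    for level, groups in multilevel(sample).items():
        print(level, groups)
```

Because each lower threshold can only split the groups found at a higher one, the four partitions are nested, which is the property a hierarchy of levels needs. A supervised step, as mentioned in the abstract, is not modeled here.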






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.