Comparative study on Ontology Based Text Documents Clustering Techniques

Apeksha Charola, Sahista Machchhar

Abstract


The growing use of the internet and the resulting volume of text documents make efficient document clustering technology necessary. In text mining, most work addresses information retrieval and document summarization, while document clustering has received much less attention. Traditional text mining represents a document as a bag of words, which is limited because it does not consider semantic relationships among the terms in a text. Semantic text mining can overcome this limitation by using an ontology: with an ontology, a document is represented as a vector of weighted concepts. This paper presents two surveys. The first covers pre-clustering approaches, and the second covers document clustering techniques.
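To illustrate the contrast between the two representations, here is a minimal Python sketch. It is not taken from the paper: the toy ontology, the function names, and the relative-frequency weighting are all assumptions chosen for illustration. A real system would use a resource such as WordNet or a domain ontology, and possibly a different weighting scheme.

```python
from collections import Counter

# Hypothetical toy ontology (an assumption, not from the paper):
# maps surface terms to a shared concept label.
TOY_ONTOLOGY = {
    "car": "vehicle",
    "automobile": "vehicle",
    "truck": "vehicle",
    "doctor": "medical_professional",
    "physician": "medical_professional",
}


def bag_of_words(text):
    """Count raw terms; synonyms remain separate, unrelated dimensions."""
    return Counter(text.lower().split())


def concept_vector(text, ontology=TOY_ONTOLOGY):
    """Map terms to concepts and weight each concept by relative frequency.

    Terms missing from the ontology are ignored. The weighting is an
    illustrative assumption; other schemes are possible.
    """
    tokens = text.lower().split()
    counts = Counter(ontology[t] for t in tokens if t in ontology)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {concept: n / total for concept, n in counts.items()}


doc_a = "the car and the truck"
doc_b = "an automobile"

# The two documents share no terms, so their bags of words do not overlap.
shared_terms = set(bag_of_words(doc_a)) & set(bag_of_words(doc_b))

# Both documents map to the same concept, so their concept vectors match.
vec_a = concept_vector(doc_a)
vec_b = concept_vector(doc_b)
```

In this sketch the two documents have no words in common, yet both reduce to the single concept `vehicle`, which is the kind of semantic relationship a bag-of-words representation cannot capture.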


Keywords


Document Clustering, Ontology, Pre-clustering, Semantic, Text Mining, Weighted Concepts

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.