Pattern Discovery using Text Mining in Gujarati Language

Manthan J. Vyas

Abstract


The volume of textual data available in electronic form in Indian regional languages is growing steadily, so language-specific mining of text documents has become necessary. The objective of this work is to discover frequent patterns in Gujarati text data. We focus on developing a pattern mining algorithm that can find different text patterns, and we present an effective pattern discovery technique based on the pattern taxonomy model. In the proposed approach, text files are taken as input, the pattern taxonomy model is applied together with the ECLAT algorithm, and the resulting frequent patterns are generated.
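
The ECLAT step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_tidsets` and `eclat`, the whitespace tokenizer, the choice to treat each paragraph as one transaction, and the toy Gujarati word lists are all assumptions made for illustration. Stop-word removal, stemming, and the pattern taxonomy model itself are omitted.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def build_tidsets(transactions: List[Set[str]]) -> Dict[str, Set[int]]:
    """Vertical layout: map each term to the ids of the transactions containing it."""
    tidsets: Dict[str, Set[int]] = {}
    for tid, terms in enumerate(transactions):
        for term in terms:
            tidsets.setdefault(term, set()).add(tid)
    return tidsets


def eclat(transactions: List[Set[str]], min_support: int) -> Dict[FrozenSet[str], int]:
    """Return {termset: support} for every termset found in >= min_support transactions."""
    frequent: Dict[FrozenSet[str], int] = {}

    def extend(prefix: FrozenSet[str], candidates: List[Tuple[str, Set[int]]]) -> None:
        # Each candidate tidset already holds the transactions containing prefix + term.
        for i, (term, tids) in enumerate(candidates):
            termset = prefix | {term}
            frequent[termset] = len(tids)
            # Grow the termset by intersecting tidsets with the remaining candidates.
            next_candidates = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    next_candidates.append((other, shared))
            if next_candidates:
                extend(termset, next_candidates)

    initial = sorted(
        ((term, tids) for term, tids in build_tidsets(transactions).items()
         if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), initial)
    return frequent


# Toy input (assumption): each string stands for one already-cleaned paragraph.
paragraphs = [
    "ગુજરાત ભારત રાજ્ય",
    "ગુજરાત રાજ્ય ભાષા",
    "ગુજરાત ભાષા",
    "ભારત ભાષા",
]
transactions = [set(p.split()) for p in paragraphs]
patterns = eclat(transactions, min_support=2)

for termset, support in sorted(patterns.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(termset), support)
```

The vertical (term to transaction-id set) layout is what distinguishes ECLAT from horizontal approaches such as Apriori: the support of a longer termset is obtained by intersecting id sets rather than rescanning the text.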

Keywords


Text Mining, Frequent Pattern, Gujarati Language

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.