Text Mining: State-of-the-Art and Research Directions

N. Venkata Sailaja, Dr. L. Padmasree, Dr. N. Mangathayaru


The volume of unstructured text is growing exponentially. To a computer, text is nothing more than a sequence of characters, so processing very large volumes of it is a challenging task. Extracting meaningful and useful patterns from text therefore requires preprocessing methods and algorithms. In general, text mining is the process of extracting valuable information and knowledge from unstructured text, and discovering patterns in such text is a major research issue in data mining.

In this survey, we discuss text mining, a young field that has evolved in recent years and draws on areas such as information retrieval, machine learning, statistics, computational data sciences and advanced data mining. We describe the main analysis tasks: preprocessing of the original text, text classification and the techniques used for it, clustering of text data, information extraction, and visualization of the results. We also discuss future challenges in this area, along with possible improvements and research directions.


Text Mining, Preprocessing, Text Classification, Clustering, Machine Learning, Information Extraction

