
Unstructured Text Summarization Approach in the Age of Big Data: A Review

Triveedee Sandhya, Sahista Machchhar

Abstract


Text summarization is the task of condensing a text document with the help of software in order to generate a summary that retains the most important parts of the original document. It is not easy for people to manually summarize large volumes of text. Because of the large amount of text being produced, and because organizations have expanded quickly with the availability of Big Data platforms, there is not enough time to read and understand every document and make judgments based on its content. Therefore, there is great demand for summaries of text documents that can serve as representative substitutes for the original documents. There are two techniques for summarizing a text document: 1) extractive summarization and 2) abstractive summarization. An extractive summarization technique selects important sentences, passages, etc. from the original text document and joins them in a shorter form to produce the summary. An abstractive summarization technique comprehends the original text and retells it in fewer words to generate the summary. In this survey paper, we present an overview and comparative analysis of unstructured text summarization approaches in the age of Big Data.
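
To make the extractive idea concrete, the following is a minimal sketch, not the method of any surveyed paper. It assumes a simple word-frequency score as the measure of sentence importance, a naive punctuation-based sentence splitter, and a tiny illustrative stopword list; the names `extractive_summary`, `split_sentences`, `tokenize`, and `STOPWORDS` are hypothetical and chosen only for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for illustration).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it",
             "on", "for", "that", "this", "with"}


def split_sentences(text):
    # Naive splitter: break on whitespace that follows ., ! or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase alphabetic tokens with stopwords removed.
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def extractive_summary(text, k=2):
    """Return the k highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Document-wide word frequencies serve as a crude importance signal.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so long sentences are not favored by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:k])  # restore document order
    return " ".join(sentences[i] for i in chosen)
```

Calling `extractive_summary(document_text, k=2)` returns two sentences copied verbatim from the input. An abstractive technique would instead generate new wording, which this sketch does not attempt.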


Keywords


Text Mining, Text Summarization, Information Retrieval, Big Data Platform.




Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.