
Keyword Based Extraction of Information Using Map Reduce Method

D. Princy, Dr. N. Balakumar


Three characteristics define Big Data: volume, variety, and velocity. Its size (volume), complexity (variety), and rate of growth (velocity) make it difficult to capture, manage, process, or analyze. Big Data is large-volume, heterogeneous, distributed data that can be structured, unstructured, or semi-structured, and it is usually generated from many different sources. Hadoop is an open-source framework that supports the storage and processing of extremely large data sets in a distributed computing environment. It makes it possible to run applications on systems with thousands of commodity hardware nodes and to handle thousands of terabytes of data. Hadoop also includes the Hadoop Distributed File System (HDFS), which manages data distributed across different nodes, and MapReduce as its programming paradigm. Most text documents are made available in PDF format. Research papers, too, are typically available only in PDF format, and extracting text from a PDF is one of the most difficult jobs. In this paper, we use a keyword-based extraction method to extract text from a txt file; with the help of these keywords, we can retrieve all the details in the corresponding part of a research paper or any PDF file.
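The abstract does not spell out the extraction algorithm, so the following is only a minimal sketch of how keyword-based extraction could be phrased as map, shuffle, and reduce steps. It runs in a single Python process rather than on Hadoop, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `extract_by_keyword`), the case-insensitive substring matching, and the sample lines are all illustrative assumptions, not the authors' method.

```python
from collections import defaultdict


def map_phase(line_no, line, keywords):
    """Map step: emit (keyword, (line_no, text)) for each keyword found in the line."""
    lowered = line.lower()
    for kw in keywords:
        # Assumption: simple case-insensitive substring matching.
        if kw.lower() in lowered:
            yield kw, (line_no, line.strip())


def shuffle(pairs):
    """Shuffle step: group all emitted values by their keyword."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(keyword, values):
    """Reduce step: collect the matching lines for one keyword, ordered by line number."""
    return keyword, sorted(values)


def extract_by_keyword(lines, keywords):
    """Run the three steps in-process over an iterable of text lines."""
    pairs = []
    for line_no, line in enumerate(lines, start=1):
        pairs.extend(map_phase(line_no, line, keywords))
    grouped = shuffle(pairs)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical sample input standing in for the lines of a txt file.
sample = [
    "Hadoop stores data in HDFS.",
    "MapReduce jobs run on the cluster.",
    "HDFS replicates blocks across nodes.",
]
result = extract_by_keyword(sample, ["HDFS", "MapReduce"])
```

In this sketch, `result` maps each keyword to the numbered lines that mention it. On a real cluster the map and reduce functions would be distributed across nodes by the framework, with the input read from HDFS instead of an in-memory list.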


Big Data, Hadoop, HDFS, Map Reduce, Keyword Extraction







Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.