
Framework to Monitor Big Data Processing in the Cloud

R. Vijaya Arjunan, Nagapandu Potti

Abstract


Big data is processed in the cloud as a series of Map and Reduce jobs. Often, huge chunks of data are pushed to the cloud for processing, overloading the processors. As a result, jobs working on such data can run for many days, making it difficult for technical analysts to keep track of them and to find the root cause of delays in their completion. This paper proposes a framework, and describes its implementation, to overcome these problems and thereby improve data processing in the cloud. The purpose of the implementation is to simplify the management of Hadoop big data processing by providing a web-based application for provisioning, managing, and monitoring data processing activities on Cloudera Apache Hadoop clusters. The paper presents an intuitive, easy-to-use web dashboard for monitoring and managing Hadoop data processing, backed by a reporting framework that provides a collective view of the data obtained from Job Trackers.


Keywords


MapReduce, Hadoop, Data Processing, Data Monitoring



This work is licensed under a Creative Commons Attribution 3.0 License.