Performance Improvement Issues and Approaches for Hadoop

Rushikesh Garadade, S.B. Deshmukh

Abstract


Hadoop has become a go-to framework for Big Data analytics. Data volumes are growing exponentially, and Hadoop is the de facto solution for handling this growth. Although Hadoop is popular for high-performance computing in data-intensive applications, increasing evidence shows that the performance of such applications can be severely limited by many factors, including the hardware configuration of the nodes, the algorithmic strategies used, architectural decisions, and whether the cluster runs on physical or virtual servers. This paper addresses performance improvement in terms of data movement between nodes and proper utilization of resources such as CPU and I/O. It also covers attempts to apply machine learning to improve resource utilization, which open a new direction for performance improvement.

Keywords


Hadoop, Performance Improvement, Big Data, Machine Learning

This work is licensed under a Creative Commons Attribution 3.0 License.