
Hadoop Extension for RDMA Interportability and Advantage of Parallel Data Shuffle through NLM

Prashant Kanhere, Sathish Kumar Penchala

Abstract


Hadoop is a well-known open-source implementation of the MapReduce programming model for distributed computing [6]. However, it faces a number of issues in achieving the best performance from the underlying system. These include a serialization barrier that delays the reduce phase, repetitive merges and disk access, and the inability to leverage the latest high-speed interconnects. We describe Hadoop-A [1], an acceleration framework that extends Hadoop with plug-in components implemented in C++ for fast data movement, overcoming these limitations. A novel network-levitated merge algorithm is introduced to merge data without repetition and disk access. In addition, a full pipeline is designed to overlap the shuffle, merge, and reduce phases. Our experimental results show that Hadoop-A doubles the data processing throughput of Hadoop and reduces CPU utilization by 38% to 40% or more. We also introduce a framework to improve the performance of MapReduce: the different phases of MapReduce contain repeated steps that can be minimized, which reduces execution time [2].
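
The idea behind a merge that avoids repetition and disk access can be illustrated with a small k-way merge over sorted segments that stay on their source nodes and are pulled in chunks only when needed. The following Python sketch is an illustration under stated assumptions, not the Hadoop-A implementation (which the abstract describes as C++ plug-in components): the names RemoteSegment, fetch_chunk, stream and levitated_merge are hypothetical, and a network or RDMA read is simulated by slicing an in-memory list.

```python
import heapq
from typing import Iterator, List, Tuple

Record = Tuple[str, str]  # (key, value); both strings in this sketch


class RemoteSegment:
    """Hypothetical stand-in for a sorted map-output segment on a remote node."""

    def __init__(self, records: List[Record], chunk_size: int = 2):
        self._records = records      # assumed to be sorted by key already
        self._chunk_size = chunk_size
        self._offset = 0
        self.fetches = 0             # counts simulated network reads

    def fetch_chunk(self) -> List[Record]:
        """Simulate one network read returning the next few records."""
        chunk = self._records[self._offset:self._offset + self._chunk_size]
        self._offset += len(chunk)
        if chunk:
            self.fetches += 1
        return chunk


def stream(segment: RemoteSegment) -> Iterator[Record]:
    """Yield records one by one, fetching a new chunk only when needed."""
    while True:
        chunk = segment.fetch_chunk()
        if not chunk:
            return
        yield from chunk


def levitated_merge(segments: List[RemoteSegment]) -> Iterator[Record]:
    """Merge all segments in one pass with a min-heap.

    Only one pending record per segment sits in the heap, so nothing is
    spilled to disk and no record is merged more than once.
    """
    iters = [stream(s) for s in segments]
    heap = []
    for idx, it in enumerate(iters):
        first = next(it, None)
        if first is not None:
            heap.append((first, idx))
    heapq.heapify(heap)

    while heap:
        record, idx = heapq.heappop(heap)
        yield record  # the reduce side can consume this immediately
        following = next(iters[idx], None)
        if following is not None:
            heapq.heappush(heap, (following, idx))


if __name__ == "__main__":
    segs = [
        RemoteSegment([("a", "1"), ("d", "4")]),
        RemoteSegment([("b", "2"), ("c", "3"), ("e", "5")]),
    ]
    for key, value in levitated_merge(segs):
        print(key, value)
```

Because levitated_merge is a generator, a consumer can start reducing the first records while later chunks are still being fetched, which loosely mirrors the overlap of the shuffle, merge, and reduce phases described in the abstract.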


Keywords


Serialization, Repetitive Merges, Disk Access, Inter Network Portability, RDMA Interconnects, Network-Levitated Merge, Parallel Shuffle, Hierarchical Merge, and Reduce.


References


W. Yu, Y. Wang, and X. Que, “Design and Evaluation of Network-Levitated Merge for Hadoop Acceleration,” IEEE Trans. Parallel and Distributed Systems, vol. 25, no. 3, Mar. 2014.

J. Liu, J. Wu, and D.K. Panda, “High Performance RDMA-Based MPI Implementation over InfiniBand,” Int'l J. Parallel Programming, vol. 32, pp. 167-198, 2004.

Y. Mao, R. Morris, and F. Kaashoek, “Optimizing MapReduce for Multicore Architectures,” Technical Report MIT-CSAIL-TR-2010-020, Massachusetts Inst. of Technology, May 2010.

D. Jiang, B.C. Ooi, L. Shi, and S. Wu, “The Performance of MapReduce: An In-depth Study,” Proc. VLDB Endowment, vol. 3, no. 1.

R. Recio, P. Culley, D. Garcia, and J. Hilland, “An RDMA Protocol Specification (Version 1.0),” Oct. 2002.

Y. Mao, R. Morris, and F. Kaashoek, “Optimizing MapReduce for Multicore Architectures,” Technical Report MIT-CSAIL-TR-2010-020, Massachusetts Inst. of Technology, May 2010.

C. Ranger, R. Raghuraman, A. Penmetsa, G.R. Bradski, and C. Kozyrakis, “Evaluating MapReduce for Multi-Core and Multiprocessor Systems,” Proc. IEEE 13th Int'l Symp. High Performance Computer Architecture (HPCA '07), pp. 13-24, 2007.

P.B. Kanhere, “A Review on Apache Hadoop Performance Enhancement by using Network Levitated Merge,” Int'l J. Computer Applications (0975-8887), Volume 2014.

T. Condie, N. Conway, P. Alvaro, and J.M. Hellerstein, “MapReduce Online,” Yahoo! Research.

InfiniBand Trade Association, http://www.infinibandta.org.

M. Borole, “Efficient Topology Aware System of Network Levitated Merge for Hadoop Acceleration,” Int'l J. Emerging Technology and Advanced Engineering (www.ijetae.com, ISSN 2250-2459), vol. 5, no. 3, Mar. 2015.

M. Zaharia, A. Konwinski, A.D. Joseph, R.H. Katz, and I. Stoica, “Improving MapReduce Hadoop Performance in Heterogeneous Environments,” Technical Report UCB/EECS-2008-99, EECS Department, Univ. of California, Berkeley, Aug. 2008.

S. Seo, I. Jang, K. Woo, I. Kim, J.-S. Kim, and S. Maeng, “HPMR: Prefetching and Pre-shuffling in Shared MapReduce Computation Environment,” Proc. IEEE Cluster Conf., pp. 1-8, Aug. 2009.

P.B. Kanhere, “Apache Hadoop Acceleration by Parallel Data Shuffle, and RDMA Interconnects through Network Levitated Merge,” Fifth Post Graduate Conference of Computer Engineering, CPGCON2016, Savitribai Phule Pune University, Pune, 25 Mar. 2016.

R. Wong and A.V. Vasilakos, “Accelerated PSO Swarm Search Feature Selection for Data Stream Mining Big Data,” IEEE Trans. Services Computing, DOI 10.1109/TSC.2015.2439695.




Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.