
Monitoring Aspects of Cloud over the Big Data Analytics Using the Hadoop for Managing Short Files

Prerna Kumari, Himanshu Sharma, Aishwarya Shekhar

Abstract


This paper presents a review of cloud computing and big data analytics using Hadoop. Hadoop is an open-source tool for storing unstructured data. It can also be described as the engineering side of big data, which is otherwise largely a matter of predictive analysis, and it is used mainly for processing and analyzing data. Hadoop has two core components: HDFS (Hadoop Distributed File System), which stores large amounts of data reliably, and MapReduce, a programming model for processing data in parallel. Hadoop is designed to handle large files and therefore suffers a performance penalty when dealing with a large number of short files: they place a heavy burden on the NameNode of HDFS and increase the execution time of MapReduce jobs. This work introduces HDFS, the short file problem, and existing ways of dealing with it. Nowadays storage is no longer the main issue; the issues are how to make sense of the data and how to convince industry that the cloud is safe.
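To make the short file problem concrete, the following minimal Python sketch estimates the NameNode load and the number of map tasks for the same volume of data stored as many short files versus a few large ones. It is an illustration under stated assumptions, not a measurement from this paper: the figure of roughly 150 bytes of NameNode memory per namespace object is a commonly quoted rule of thumb, the 128 MiB block size is an assumed configuration, one map task per block is a simplification, and the function names are hypothetical.

```python
# Assumption (rule of thumb, not from the paper): every file and every block
# is one NameNode object costing roughly 150 bytes of heap memory.
BYTES_PER_OBJECT = 150
# Assumption: HDFS block size of 128 MiB.
BLOCK_SIZE = 128 * 1024 * 1024


def blocks_per_file(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Blocks needed for one non-empty file (integer ceiling division)."""
    return -(-file_size // block_size)


def namenode_objects(file_count: int, file_size: int) -> int:
    """Namespace objects held by the NameNode: one per file plus one per block."""
    return file_count * (1 + blocks_per_file(file_size))


def namenode_bytes(file_count: int, file_size: int) -> int:
    """Estimated NameNode memory under the per-object assumption above."""
    return namenode_objects(file_count, file_size) * BYTES_PER_OBJECT


def map_tasks(file_count: int, file_size: int) -> int:
    """Simplified model: one map task per block, and at least one per file."""
    return file_count * blocks_per_file(file_size)


if __name__ == "__main__":
    total = 10 * 1024 ** 3   # 10 GiB of data in both layouts
    short = 1024 * 1024      # layout A: 1 MiB files
    large = BLOCK_SIZE       # layout B: 128 MiB files
    for label, size in (("short files", short), ("large files", large)):
        count = total // size
        print(label, count, "files,",
              namenode_bytes(count, size), "bytes of NameNode memory,",
              map_tasks(count, size), "map tasks")
```

Under these assumptions, 10 GiB stored as 1 MiB files requires 128 times as many namespace objects and map tasks as the same data stored as 128 MiB files, which is the burden on the NameNode and on MapReduce described above.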


Keywords


Big Data Analytics, Cloud Computing, Hadoop, Short Files.






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.