A Review of Big Data Challenges and Techniques

S. Saranya, Dr. N. Kavitha

Abstract


Big data is one of the most important trends shaping the new generation of analytical tools. It has applications in areas such as traffic control, weather forecasting, fraud detection, security, education, and health care. Extracting knowledge from massive data sets has become a challenging task. Big data usually refers to data sets whose size is beyond the ability of commonly used software tools to capture, store, and analyze them within a tolerable elapsed time. With the widespread use of computing devices such as smartphones, laptops, and wearable devices, the volume of data processed over the internet has exceeded what modern computers can handle. This high growth rate gave rise to the term Big Data. However, the rapid growth of such large data sets creates numerous challenges, including data inconsistency and incompleteness, scalability, timeliness, and security. The question now is how to develop a high-performance platform that analyzes big data efficiently, and how to design mining algorithms that extract useful information from it. This paper begins with a brief introduction to big data technology and its importance, then focuses on the challenges and issues that need attention. The tools used in big data technology are also discussed in detail.


Keywords


Big Data, Hadoop, MapReduce, Pig, Hive, HBase






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.