Association Rule Mining in Big Data: A New Perspective

S. Charles; N. Aarthi

Abstract

Association rule mining is a well-established approach for extracting frequent itemsets from Big Data. This article gives a theoretical overview of association rule mining and reviews Hadoop and MapReduce implementations of association rule algorithms that researchers have applied to different datasets. It then discusses association-rule-based techniques and compares the efficiency of the algorithms in terms of the speedup, sizeup, and scaleup measures used in big data analytics. The goal of association analysis is to find relationships among items that occur together more often than would be expected from a random sampling of all possibilities. The paper does not claim to be a complete review of all research in the area; rather, it summarizes and appraises the work carried out by researchers using association rules in Big Data.
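The abstract names these quantities without defining them, so the sketch below uses the definitions common in the parallel data mining literature. That choice is an assumption, not necessarily the authors' formulation. It shows support and confidence for a rule over a toy transaction list, and speedup, sizeup, and scaleup as ratios of run times. The transactions, the helper names, and any timings passed in are hypothetical.

```python
from typing import Iterable, List, Set


def support(transactions: List[Set[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions: List[Set[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """support(antecedent and consequent) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def speedup(time_one_node: float, time_m_nodes: float) -> float:
    """Same dataset, more nodes: T(1 node) / T(m nodes). Ideal value is m."""
    return time_one_node / time_m_nodes


def sizeup(time_base_data: float, time_k_times_data: float) -> float:
    """Same nodes, k times the data: T(k * D) / T(D). Ideal value is at most k."""
    return time_k_times_data / time_base_data


def scaleup(time_base: float, time_scaled: float) -> float:
    """Nodes and data grown together: T(1 node, D) / T(m nodes, m * D).
    Ideal value is 1.0."""
    return time_base / time_scaled


# Hypothetical toy basket data, for illustration only.
toy_transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

if __name__ == "__main__":
    print(support(toy_transactions, {"bread", "milk"}))
    print(confidence(toy_transactions, {"bread"}, {"milk"}))
    # Made-up timings in seconds, not measurements from any cited work.
    print(speedup(100.0, 25.0), sizeup(10.0, 40.0), scaleup(10.0, 12.5))
```

Under these definitions, a speedup close to the number of nodes and a scaleup close to 1.0 indicate that an algorithm parallelizes well, which is the sense in which such measures are typically used to compare MapReduce-based implementations.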

Keywords

Big Data, MapReduce, Association Rule, Machine Learning, Hadoop, A-Priori, Size Up, Scale Up, Speed Up.
