A Survey on Frequent Itemset Mining in Parallel Computing Environment

Pratipalsinh Zala; Hiren Kotadiya; Sanjay D. Bhanderi

Abstract

Information is growing explosively in every field, including marketing, science, and technology [15]. Frequent pattern mining is the process of discovering knowledge from very large databases, and a large body of research has been published on it. Today, frequent pattern mining on a single processor or node has become a bottleneck, because large data sets give rise to millions of transactions. In this paper we survey frequent pattern mining algorithms that have been proposed for parallel computing environments. Parallel computing is an efficient way to handle massive computational tasks such as frequent pattern mining. The survey covers important and well-known papers on parallel frequent pattern mining; these papers take traditional methods as a base and develop novel parallel approaches on top of them. Apriori [1] and the FP-tree-based FP-growth [2] are well-known and efficient frequent pattern mining algorithms, but on their own they are no longer sufficient at current data volumes. The Inverted Matrix approach [7] is also discussed in the parallel setting; it overcomes several inefficiencies of the conventional Apriori and FP-growth algorithms. We present a table of parameters that evaluates all the approaches covered here in a parallel computing environment.
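
As an illustration of why partitioning the transaction database helps, the following minimal Python sketch splits the transactions across workers, counts itemsets in each partition independently, and merges the partial counts. It is not taken from any of the surveyed papers. The function names, the toy transactions, and the use of a thread pool are assumptions made for illustration, and the counting is brute force over all k-item subsets rather than Apriori's candidate generation; a real system would run the partitions on separate processes or nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def count_partition(transactions, k):
    """Count every k-item subset that occurs in one partition (hypothetical helper)."""
    counts = Counter()
    for t in transactions:
        # Sort so the same itemset always maps to the same tuple key.
        for itemset in combinations(sorted(set(t)), k):
            counts[itemset] += 1
    return counts


def frequent_itemsets(transactions, k, min_support, workers=2):
    """Partition the data, count partitions concurrently, then merge and filter."""
    # Round-robin split of the database into one slice per worker.
    parts = [transactions[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_partition, parts, [k] * workers))
    # Global support is the sum of the per-partition counts.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return {s: c for s, c in total.items() if c >= min_support}


# Toy data, invented for this example.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
result = frequent_itemsets(transactions, k=2, min_support=2)
```

Only the small per-partition count tables need to be exchanged between workers, which is the property that parallel approaches of this kind rely on to keep communication low.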

Keywords

Apriori, FP-tree, Frequent Itemset, Parallel Computing

References

[1] Agrawal, Rakesh, and Ramakrishnan Srikant. "Fast algorithms for mining association rules." In Proc. 20th int. conf. very large data bases, VLDB, vol. 1215, pp. 487-499. 1994.

[2] Han, Jiawei, Jian Pei, and Yiwen Yin. "Mining frequent patterns without candidate generation." In ACM SIGMOD Record, vol. 29, no. 2, pp. 1-12. ACM, 2000.

[3] Chen, Dehao, Chunrong Lai, Wei Hu, WenGuang Chen, Yimin Zhang, and Weimin Zheng. "Tree partition based parallel frequent pattern mining on shared memory systems." In Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International, 8 pp. IEEE, 2006.

[4] Lin, Che-Yu, Kun-Ming Yu, Wen Ouyang, and Jiayi Zhou. "An OpenCL Candidate Slicing Frequent Pattern Mining algorithm on graphic processing units." In Systems, Man, and Cybernetics (SMC), 2011 IEEE International Conference on, pp. 2344-2349. IEEE, 2011.

[5] Javed, Asif, and Ashfaq Khokhar. "Frequent pattern mining on message passing multiprocessor systems." Distributed and Parallel Databases 16, no. 3 (2004): 321-334.

[6] Yu, Kun-Ming, Jiayi Zhou, and Wei Chen Hsiao. "Load balancing approach parallel algorithm for frequent pattern mining." In Parallel Computing Technologies, pp. 623-631. Springer Berlin Heidelberg, 2007.

[7] El-Hajj, Mohammad, and Osmar R. Zaïane. "Inverted matrix: Efficient discovery of frequent items in large datasets in the context of interactive mining." In Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 109-118. ACM, 2003.

[8] El-Hajj, Mohammad, and Osmar R. Zaïane. "Parallel association rule mining with minimum inter-processor communication." In Database and Expert Systems Applications, 2003. Proceedings. 14th International Workshop on, pp. 519-523. IEEE, 2003.

[9] Thompson, Jonathan, and Kristofer Schlachter. "An introduction to OpenCL programming model." Digital version, 2012.

[10] El-Hajj, Mohammad, and Osmar R. Zaïane. "Parallel leap: large-scale maximal pattern mining in a distributed environment." In Parallel and Distributed Systems, 12th International IEEE Conference on ICPADS, vol. 1, 8 pp. 2006.

[11] El-Hajj, Mohammad, and Osmar R. Zaïane. "Parallel Bifold: Large-scale parallel pattern mining with constraints." Distributed and Parallel Databases 20, no. 3 (2006): 225-243.

[12] Liu, Li, Eric Li, Yimin Zhang, and Zhizhong Tang. "Optimization of frequent itemset mining on multiple-core processor." In Proceedings of the 33rd international conference on Very large data bases, pp. 1275-1285. VLDB Endowment, 2007.

[13] Kambadur, Prabhanjan, Amol Ghoting, Anshul Gupta, and Andrew Lumsdaine. "Extending task parallelism for frequent pattern mining." arXiv preprint arXiv:1211.1658 (2012).

[14] Yu, Kun-Ming, and Jiayi Zhou. "Parallel TID-based frequent pattern mining algorithm on a PC Cluster and grid computing system." Expert Systems with Applications 37, no. 3 (2010): 2486-2494.

[15] Han, J., and M. Kamber. Data Mining: Concepts and Techniques, 2nd ed. Morgan Kaufmann Publishers, March 2006. ISBN 1-55860-901-6.

[16] http://www.khronos.org/opencl

This work is licensed under a Creative Commons Attribution 3.0 License.
