A Matrix-Based Approach for Frequent Itemsets Mining
Recent advances in computer technology, including higher speed, lower cost, greater computing power and shorter data processing times, have spurred increased interest in data mining applications that extract useful knowledge from data. Discovering association rules that identify relationships among sets of items is an important problem in data mining. Finding frequent itemsets is computationally the most expensive step in association rule discovery and has therefore attracted significant research attention. In this paper, a novel approach for mining the complete set of frequent itemsets is presented. The algorithm is partially based on the FP-tree hypothesis: it generates a candidate set of large 2-itemsets, forms a matrix from the supports of these 2-itemsets, and uses that matrix to generate all frequent k-itemsets in the database.
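The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea of using a 2-itemset support matrix to guide the search for frequent k-itemsets. It is not the paper's method: the function names (`pair_support_matrix`, `frequent_itemsets`), the absolute `min_support` count, the level-wise extension, and the extra pass over the transactions to confirm each candidate's support are all assumptions made for illustration, and the FP-tree component is omitted entirely.

```python
from itertools import combinations


def pair_support_matrix(transactions, items):
    """Count co-occurrences of every item pair (hypothetical helper).

    matrix[a][b] is the number of transactions containing both items;
    the diagonal matrix[a][a] holds the support of the single item.
    """
    index = {item: i for i, item in enumerate(items)}
    n = len(items)
    matrix = [[0] * n for _ in range(n)]
    for transaction in transactions:
        present = sorted(index[i] for i in set(transaction))
        for a in present:
            matrix[a][a] += 1
        for a, b in combinations(present, 2):
            matrix[a][b] += 1
            matrix[b][a] += 1
    return matrix


def frequent_itemsets(transactions, min_support):
    """Return {itemset tuple: support} for all frequent itemsets.

    Assumption: min_support is an absolute transaction count.
    """
    items = sorted({i for t in transactions for i in t})
    matrix = pair_support_matrix(transactions, items)
    tsets = [frozenset(t) for t in transactions]
    n = len(items)
    result = {}

    # Frequent 1-itemsets come from the diagonal of the matrix.
    for a in range(n):
        if matrix[a][a] >= min_support:
            result[(items[a],)] = matrix[a][a]

    # Frequent 2-itemsets are read directly off the matrix.
    level = []
    for a, b in combinations(range(n), 2):
        if matrix[a][b] >= min_support:
            result[(items[a], items[b])] = matrix[a][b]
            level.append((a, b))

    # Extend each frequent k-itemset by one later item at a time.
    while level:
        next_level = []
        for prefix in level:
            for c in range(prefix[-1] + 1, n):
                # Matrix pruning: every pair in the candidate must be frequent.
                if not all(matrix[p][c] >= min_support for p in prefix):
                    continue
                candidate = prefix + (c,)
                names = tuple(items[i] for i in candidate)
                # Pairwise frequency is necessary but not sufficient,
                # so confirm the real support against the transactions.
                support = sum(1 for t in tsets if frozenset(names) <= t)
                if support >= min_support:
                    result[names] = support
                    next_level.append(candidate)
        level = next_level
    return result
```

In this sketch the matrix serves two purposes: it yields the frequent 1- and 2-itemsets without further counting, and it discards any larger candidate that contains an infrequent pair before the candidate's support is counted.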
This work is licensed under a Creative Commons Attribution 3.0 License.