GenMax with Index Technique for Pruning Bounded Frequent Itemsets

Dr. C. Sathya

Abstract

Mining frequent itemsets is a core problem in many data mining applications, such as the extraction of association rules, correlations, and multidimensional patterns, as well as some pattern matching tasks. For transactional databases, solutions to frequent itemset problems need to be fast and memory efficient. GenMax is a search-based algorithm that mines only the maximal frequent itemsets. It applies several optimization techniques to prune the search space: progressive focusing is used for maximality checking, and differential set (diffset) propagation is used for fast frequency computation. However, GenMax does not handle closed frequent itemsets. To address this, a GenMax with index technique is presented for quick and effective pruning of bounded frequent itemsets, so that all maximal frequent itemsets and all closed frequent itemsets can be enumerated. Experimental results show that the improved GenMax with an incremental update strategy scales better. To evaluate performance, the proposed index-oriented GenMax is compared with the existing GenMax on the pruning of bounded frequent itemsets, in terms of item precision and speed.
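To make the distinction between maximal and closed frequent itemsets concrete, the following is a minimal brute-force sketch in Python. It is not GenMax and not the index technique of this paper; it only illustrates the standard definitions on a toy database. The function names, the toy transactions, and the absolute support threshold are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset whose support
    (number of containing transactions) is at least min_support.
    Brute-force enumeration, suitable only for tiny examples."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                result[cand] = support
    return result


def closed_itemsets(frequent):
    """Frequent itemsets that have no proper superset with the same support."""
    return {
        s for s, sup in frequent.items()
        if not any(s < other and frequent[other] == sup for other in frequent)
    }


def maximal_itemsets(frequent):
    """Frequent itemsets that have no frequent proper superset."""
    return {s for s in frequent if not any(s < other for other in frequent)}


# Hypothetical toy database of four transactions.
transactions = [
    frozenset("abc"),
    frozenset("ab"),
    frozenset("ac"),
    frozenset("a"),
]
frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
maximal = maximal_itemsets(frequent)
```

On this toy input the frequent itemsets are {a}, {b}, {c}, {a, b}, and {a, c}. Of these, {a}, {a, b}, and {a, c} are closed, while only {a, b} and {a, c} are maximal, which shows that every maximal frequent itemset is closed but not the reverse.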

Keywords

Itemset Mining, Bounded Itemset, Index structure, Incremental Update

This work is licensed under a Creative Commons Attribution 3.0 License.