
A Survey on High Utility Data Mining for Increasing Transactions Databases

S. Siva, Dr. Shilpa Chaudhari

Abstract


Interest in high utility itemset mining has grown significantly, driven by the volume of real-time transactions. Current high utility itemset mining techniques each focus on a single constraint and are evaluated on their own performance under that constraint. When discovering new knowledge from a large database, however, it is important to apply a combination of (mixed) constraints to accomplish the task effectively and profitably. In the existing literature we reviewed, the algorithms are designed and developed for static databases and rely on single constraints. In real-time applications, by contrast, insert, delete, and update operations occur continuously, as often as every minute. This paper therefore surveys high utility data mining algorithms and their challenges, including incremental and periodic high utility mining. It also studies and proposes the use of multiple constraints in high utility data mining, and examines how to measure the accuracy of algorithms under different constraints on data, models, and measures.


Keywords


Data Mining, High Utility Data Set, Transaction DB, Constraints, Data, Models and Measures





Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.