Rational Computation for Mining Association Rules from XML Documents

T. Sandhiya; M.S. Saravanan

Abstract

We propose an approach based on Tree-based Association Rules (TARs). The mined rules provide approximate, intensional information on both the structure and the contents of XML documents, and can themselves be stored in XML format. This mined knowledge is later used to provide (i) a concise idea (the gist) of both the structure and the content of an XML document and (ii) quick, approximate answers to queries. The project also presents a database model for storing large volumes of data in an XML database that can be searched by any keyword. A search is performed by matching nodes, and the individual matches are ranked to narrow down the possible search intentions. The XML database can thus hold a large volume of data while letting the user search for details effectively.
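As an illustration only, the keyword search over XML nodes with ranked matches could be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the sample document, the function name `keyword_search`, and the scoring rule (number of occurrences of the keyword in a node's own text) are all hypothetical choices made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the abstract does not specify any schema.
SAMPLE = """
<library>
  <book><title>XML Data Mining</title><author>Rossi</author></book>
  <book><title>Query Answering</title><author>Bianchi</author>
        <note>XML and XML summaries</note></book>
  <book><title>Graph Theory</title><author>Verdi</author></book>
</library>
"""

def keyword_search(xml_text, keyword):
    """Return (score, tag, text) for every node whose own text contains
    the keyword, ordered from highest to lowest score.

    The score is an assumed ranking rule: the case-insensitive count of
    keyword occurrences in the node's text.
    """
    root = ET.fromstring(xml_text)
    needle = keyword.lower()
    hits = []
    for node in root.iter():               # visit every element in document order
        text = (node.text or "").strip()   # node.text may be None
        score = text.lower().count(needle)
        if score > 0:
            hits.append((score, node.tag, text))
    # Stable sort: nodes with equal scores keep document order.
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits

results = keyword_search(SAMPLE, "xml")
for score, tag, text in results:
    print(score, tag, text)
```

A real system would likely rank on richer signals (node depth, tag names, rule support), which the abstract does not detail; occurrence counting is used here only to keep the sketch self-contained.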

Keywords

XML, Approximate Query-Answering, Data Mining, Intensional Information

This work is licensed under a Creative Commons Attribution 3.0 License.
