
Multidimensional Database Model for Web Content Mining

Vikrant Sabnis, R.S. Thakur

Abstract


With advances in network technologies and growth in the number of network users, attempts are being made to discover useful knowledge from secondary data. A large number of models, techniques and methods for retrieving this knowledge are evolving continuously in the area of web content mining. These techniques are becoming critical for the effective management of web sites in a variety of domains such as business, education and e-learning. Using a prediction approach, users' browsing behavior can be estimated, and this information can be used to design suitable web sites. This paper proposes a star schema for web content mining from complex data that is multidimensional in nature. The association among web contents is then explored using a multidimensional association rule mining (ARM) approach to understand the surfing behavior of web users. Finally, the performance of the proposed work is evaluated; the results show an improvement in gain, and the implementation illustrates the significance of multidimensional association rules in web content data. The paper also compares the pros and cons of the proposed approach with traditional state-of-the-art approaches.
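As a rough illustration of the idea of multidimensional association rules over star-schema data, the following minimal sketch mines inter-dimension rules from a small fact table. It is not the paper's implementation: the dimension names (`user_group`, `content_type`, `time_slot`), the sample rows, the thresholds and the function `mine_rules` are all hypothetical, and the rows stand in for a fact table already joined with its dimension tables.

```python
from collections import Counter
from itertools import combinations

# Hypothetical fact rows: one page view per row, one value per dimension.
FACTS = [
    {"user_group": "student", "content_type": "tutorial", "time_slot": "evening"},
    {"user_group": "student", "content_type": "tutorial", "time_slot": "evening"},
    {"user_group": "student", "content_type": "tutorial", "time_slot": "morning"},
    {"user_group": "staff", "content_type": "news", "time_slot": "morning"},
    {"user_group": "student", "content_type": "news", "time_slot": "evening"},
]


def mine_rules(facts, min_support=0.5, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) tuples.

    Each item is a (dimension, value) pair. Because a row holds exactly one
    value per dimension, every 2-itemset spans two different dimensions,
    so all rules found here are inter-dimension rules.
    """
    n = len(facts)
    counts = Counter()
    for row in facts:
        items = sorted(row.items())
        # Count single items and pairs of items that co-occur in the row.
        for size in (1, 2):
            for combo in combinations(items, size):
                counts[combo] += 1

    rules = []
    for itemset, count in counts.items():
        support = count / n
        if len(itemset) != 2 or support < min_support:
            continue
        # Try the rule in both directions: A -> B and B -> A.
        for antecedent, consequent in (itemset, itemset[::-1]):
            confidence = count / counts[(antecedent,)]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return rules


if __name__ == "__main__":
    for antecedent, consequent, support, confidence in mine_rules(FACTS):
        print(f"{antecedent} -> {consequent} "
              f"(support={support:.2f}, confidence={confidence:.2f})")
```

The sketch counts only 1- and 2-itemsets for brevity; a fuller treatment would extend candidate generation level by level, as in Apriori-style algorithms.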

Keywords


Web Content Mining, Data Mining, Pattern Discovery





Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.