
A Novel Model for Web Mining

Monalisa Panda, Pradosh Ranjan Parida, Umesh Chandra Mishra, Siba Prasada Panigrahi

Abstract


The aim of this paper is to propose a novel Web usage mining approach based on the Latent Dirichlet Allocation (LDA) model. Experimental results on the selected usage data set show that the proposed LDA-based approach can reveal the latent task space and generate user session clusters of better quality than conventional Latent Semantic Analysis (LSA) based approaches.
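To illustrate the general idea of treating each user session as a "document" of page views and each latent task as a "topic", the following minimal sketch fits LDA with collapsed Gibbs sampling and assigns every session to its dominant task. It is an illustration under stated assumptions, not the procedure used in the paper: the inference method, the hyperparameters `alpha` and `beta`, the toy page IDs, and the function names `lda_gibbs` and `cluster_sessions` are all hypothetical choices made here.

```python
import random

def lda_gibbs(sessions, n_pages, n_tasks, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA.

    sessions: list of sessions, each a list of integer page IDs.
    Returns theta, where theta[s][k] estimates P(task k | session s).
    """
    rng = random.Random(seed)
    n_st = [[0] * n_tasks for _ in sessions]        # session-task counts
    n_tp = [[0] * n_pages for _ in range(n_tasks)]  # task-page counts
    n_t = [0] * n_tasks                             # page views per task
    z = []                                          # task of each page view

    # Random initial task assignment for every page view.
    for s, pages in enumerate(sessions):
        z_s = []
        for p in pages:
            t = rng.randrange(n_tasks)
            z_s.append(t)
            n_st[s][t] += 1
            n_tp[t][p] += 1
            n_t[t] += 1
        z.append(z_s)

    for _ in range(iters):
        for s, pages in enumerate(sessions):
            for i, p in enumerate(pages):
                # Remove the current assignment from the counts.
                t = z[s][i]
                n_st[s][t] -= 1
                n_tp[t][p] -= 1
                n_t[t] -= 1
                # Resample the task from its full conditional distribution.
                weights = [
                    (n_st[s][k] + alpha) * (n_tp[k][p] + beta) / (n_t[k] + n_pages * beta)
                    for k in range(n_tasks)
                ]
                t = rng.choices(range(n_tasks), weights=weights)[0]
                z[s][i] = t
                n_st[s][t] += 1
                n_tp[t][p] += 1
                n_t[t] += 1

    # Smoothed session-task mixtures; each row sums to 1.
    return [
        [(n_st[s][k] + alpha) / (len(pages) + n_tasks * alpha) for k in range(n_tasks)]
        for s, pages in enumerate(sessions)
    ]

def cluster_sessions(theta):
    """Assign each session to the task with the highest estimated probability."""
    return [max(range(len(row)), key=row.__getitem__) for row in theta]

# Toy usage data (hypothetical): pages 0-2 and pages 3-5 form two navigation patterns.
sessions = [[0, 1, 2, 1], [0, 2, 1], [3, 4, 5], [4, 5, 3, 3]]
theta = lda_gibbs(sessions, n_pages=6, n_tasks=2)
clusters = cluster_sessions(theta)
print(clusters)
```

Taking the dominant task per session is only one simple way to form session clusters; the session-task mixtures in `theta` could equally be passed to a separate clustering algorithm.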


Keywords


Web Usage Mining, LSA, LDA






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.