Constraint-Based Multidimensional Frequent Sequential Pattern in Web Usage Mining

S. Vijayalakshmi; Dr. V. Mohan; S. Suresh Raja

Abstract

Sequential pattern mining is an important data mining approach that extracts frequent subsequences as patterns from a sequence database. The basic formulation of the frequent sequential pattern discovery problem assumes that the only constraint to be satisfied by discovered patterns is the minimum support threshold. Data mining systems should also be able to exploit additional constraints to speed up the mining process. Although much work has been done in this area on one- and two-dimensional databases, mining sequential patterns from multidimensional databases is still in progress. In this paper we introduce an efficient strategy for discovering such patterns. Web usage mining is the application of data mining techniques to discover usage patterns from Web data, in order to understand and better serve the needs of Web-based applications. It consists of three phases, namely preprocessing, pattern discovery, and pattern analysis. This paper describes each of these phases in detail.

The main objective of multidimensional sequential pattern mining is to provide the end user with more useful and interesting patterns. To mine this kind of sequence data, we use an extended version of the PrefixSpan algorithm (EXT-Prefixspan) to extract constraint-based multidimensional frequent sequential patterns in Web usage mining. A Web access pattern is a sequential pattern that is followed frequently by users. Using these sequences as prefixes, a projected database is constructed, which is then mined recursively to find the frequent sequential patterns. EXT-Prefixspan mines the complete set of patterns while greatly reducing the effort of candidate subsequence generation. Moreover, prefix projection substantially reduces the size of the projected database and leads to efficient processing. We show that the EXT-Prefixspan algorithm is more flexible at capturing desired knowledge than previous algorithms.
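The prefix-projection idea described above can be illustrated with a minimal sketch of the basic PrefixSpan recursion. This is not the authors' EXT-Prefixspan, whose multidimensional and constraint handling is not specified in this abstract. It assumes each user session is a plain list of page identifiers; the function name `prefixspan`, the optional `max_length` constraint, and the sample sessions are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def prefixspan(sequences, min_support, max_length=None):
    """Return {pattern_tuple: support} for every frequent sequential pattern.

    sequences   : list of sessions, each a list of page identifiers (assumed format)
    min_support : minimum number of sessions that must contain a pattern
    max_length  : optional length constraint (an illustrative assumption)
    """
    results = {}

    def mine(prefix, projected):
        # `projected` is the projected database: (session index, start offset)
        # pairs, each denoting the suffix that follows the current prefix.
        support = defaultdict(int)
        for seq_id, start in projected:
            # Count each item at most once per session.
            for item in set(sequences[seq_id][start:]):
                support[item] += 1

        for item, count in sorted(support.items()):
            if count < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = count
            if max_length is not None and len(pattern) >= max_length:
                continue  # constraint reached: do not grow this pattern further
            # Project on `item`: keep the suffix after its first occurrence.
            new_projected = []
            for seq_id, start in projected:
                seq = sequences[seq_id]
                try:
                    pos = seq.index(item, start)
                except ValueError:
                    continue  # this session does not contain the item
                new_projected.append((seq_id, pos + 1))
            mine(pattern, new_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


if __name__ == "__main__":
    # Hypothetical Web sessions, one list of visited pages per user.
    sessions = [
        ["home", "products", "cart"],
        ["home", "cart"],
        ["home", "products", "checkout"],
        ["products", "cart"],
    ]
    for pattern, count in sorted(prefixspan(sessions, min_support=2).items()):
        print(" -> ".join(pattern), count)
```

Because each recursive call scans only the suffixes that follow the current prefix, no candidate subsequences are generated and tested against the full database, and the projected database shrinks as the prefix grows.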

Keywords

Data mining, Frequent Pattern mining, Sequence pattern mining, Web usage mining

This work is licensed under a Creative Commons Attribution 3.0 License.
