Measurement of Distance from Page Sequences Using Dynamic Programming

Brijesh Bakariya, Ghanshyam Singh Thakur

Abstract


The Internet plays a vital role in accessing information because so much information is available on it. The volume of web data is growing rapidly, but much of it is irrelevant, and it comes in many different formats. This heterogeneity makes retrieving relevant information from web data a challenging task. Web usage mining techniques extract relevant information from the large amount of data held in web logs, which record intrinsic information about the web pages accessed. Because web log data is so large, it is better to work with a small set of data at a time than to handle the whole data set jointly. We therefore need to find the distance between two user sessions, which a distance or similarity function can provide. Clustering then groups users who exhibit similar browsing patterns. In this paper we propose a novel algorithm for measuring the similarity between two user sessions, based on sequence alignment using the Longest Common Subsequence method.
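As an illustration of the general idea only, the sketch below computes the Longest Common Subsequence of two page sequences with the textbook dynamic programming recurrence and turns it into a distance. It is not the algorithm proposed in the paper, which the abstract does not spell out. The function names, the sample page identifiers, and the normalization by the longer session length are assumptions made for this example.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two page sequences."""
    # prev[j] is the LCS length of the already processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Matching pages extend the best alignment of both prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise keep the better result of skipping one page.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def session_distance(a, b):
    """Assumed normalization: 1 - LCS / max(len). 0.0 means identical sessions."""
    if not a and not b:
        return 0.0
    return 1.0 - lcs_length(a, b) / max(len(a), len(b))


# Hypothetical sessions, each an ordered list of visited pages.
s1 = ["home", "products", "cart", "checkout"]
s2 = ["home", "search", "products", "checkout"]
print(lcs_length(s1, s2))        # 3
print(session_distance(s1, s2))  # 0.25
```

The two sample sessions share the ordered pages home, products, and checkout, so the LCS length is 3 and the distance is 1 - 3/4 = 0.25. A pairwise distance of this kind could then feed a clustering step, although the choice of normalization affects how sessions of very different lengths compare.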

Keywords


Clustering, Longest Common Subsequence, Web Logs, Web Usage Mining.

This work is licensed under a Creative Commons Attribution 3.0 License.