Mining Scalable Multidimensional Sequential User Access Logs Using Parallel Partitioning Transaction Reduction Algorithm

C. Thavamani, Dr. A. Rengarajan

Abstract


Web usage mining refers to the automatic discovery and analysis of patterns in clickstream data generated as a result of user interactions with Web resources on one or more Web sites. The primary data sources used in Web usage mining are server log files, which include Web server access logs and application server logs. Web usage mining techniques are used to analyze the usage patterns of a Web site, and the user access log is the source from which user access patterns are extracted. These patterns are preprocessed with methods such as data fusion, data cleaning, session identification, exclusive user identification, page view identification, term view identification and path completion. To make the entire preprocessing faster, a hash map is used for data organization. Preprocessing yields distinct groups of users with common interests. The complete preprocessing is performed on the usage patterns stored in Web server access logs in order to provide a clean, unique and reduced dataset for pattern mining. This automatically reduces the original size of the dataset, which makes pattern mining and analysis easier and increases prediction accuracy. Numerous pattern mining approaches can be applied to the cleaned data. The preprocessing practices improve the quality of the pattern mining methodologies, and the results can be used by recommender systems to determine the behavior of a user. The key objective is to carry out the above activities at high speed and to achieve high prediction accuracy by concentrating on data preprocessing, discovery of interesting patterns and assessment.
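As a rough illustration of the preprocessing steps named in the abstract (data cleaning, user identification, page view identification and session identification, organized in a hash map), a minimal Python sketch follows. It is not the authors' algorithm. It assumes Common Log Format input, identifies users by IP address alone, filters a hypothetical list of static-resource suffixes, and splits sessions on an assumed 30-minute inactivity timeout. The sample log lines are invented for the example.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: log lines follow the Common Log Format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
)
# Assumption: requests for these resources are not page views.
IGNORED_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")
# Assumption: 30 minutes of inactivity starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)


def parse_line(line):
    """Return (ip, timestamp, url, status), or None for a malformed line."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return match.group("ip"), timestamp, match.group("url"), int(match.group("status"))


def is_page_view(url, status):
    """Data cleaning: keep successful requests for non-static resources."""
    path = url.split("?", 1)[0].lower()
    return 200 <= status < 300 and not path.endswith(IGNORED_SUFFIXES)


def build_sessions(lines):
    """Group cleaned requests by user in a hash map, then split into sessions."""
    requests_by_user = defaultdict(list)  # hash map: user -> [(time, url)]
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ip, timestamp, url, status = parsed
        if is_page_view(url, status):
            requests_by_user[ip].append((timestamp, url))

    sessions = {}
    for user, requests in requests_by_user.items():
        requests.sort(key=lambda request: request[0])
        user_sessions = []
        last_seen = None
        for timestamp, url in requests:
            # Start a new session on the first request or after a long gap.
            if last_seen is None or timestamp - last_seen > SESSION_TIMEOUT:
                user_sessions.append([])
            user_sessions[-1].append(url)
            last_seen = timestamp
        sessions[user] = user_sessions
    return sessions


# Invented sample data, used only to exercise the sketch.
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.1 - - [10/Oct/2023:13:55:37 +0000] "GET /style.css HTTP/1.1" 200 512',
    '10.0.0.1 - - [10/Oct/2023:13:58:00 +0000] "GET /products.html HTTP/1.1" 200 4100',
    '10.0.0.1 - - [10/Oct/2023:15:10:00 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2023:14:00:00 +0000] "GET /missing.html HTTP/1.1" 404 0',
    '10.0.0.2 - - [10/Oct/2023:14:01:00 +0000] "GET /about.html HTTP/1.1" 200 1800',
]

sessions = build_sessions(SAMPLE_LOG)
```

Keying the dictionary by user keeps the grouping step at roughly constant time per request, which is the usual motivation for a hash map here. A fuller treatment would likely combine IP address with user agent or cookies for user identification and would add path completion, both of which this sketch omits.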


Keywords


Common Interests, Server Log Files, Preprocessing, Hash Map, Exclusive User Identification, Page View Identification, Matrix Transaction Reduction.




Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.