Predicting user behavior through sessions using the web log mining

Web log mining is a method for extracting user sessions from log files. Each user is first identified by the IP address recorded in the log file, and the corresponding user sessions are then extracted. Two types of logs, i.e., server-side logs and client-side logs, are commonly used for web usage and usability analysis. Server-side logs are generated automatically by web servers, with each entry corresponding to one user request. Client-side logs can capture accurate, comprehensive usage data for usability analysis. Usability is defined as the satisfaction, efficiency, and effectiveness with which specific users can complete specific tasks in a particular environment. The process consists of three stages: data cleaning, user identification, and session identification. In this paper, we implement these three stages. Mining is then performed based on how frequently users visit each page. Once a user's sessions have been identified, we can analyze that user's behavior from the time spent on each page.
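The three stages can be illustrated with a minimal Python sketch. It is not the implementation from the paper. It assumes server-side logs in the Common Log Format, treats requests for static resources and non-2xx responses as noise, and starts a new session after 30 minutes of inactivity, a common heuristic that the abstract does not specify. All function names, the sample log lines, and the timeout value are hypothetical.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Assumption: entries follow the Common Log Format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
)
# Assumption: these suffixes mark embedded resources, not page views.
IGNORED_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")
# Assumption: 30 minutes of inactivity ends a session.
SESSION_TIMEOUT = timedelta(minutes=30)


def clean(lines):
    """Stage 1, data cleaning: keep successful page requests only."""
    for line in lines:
        m = LOG_PATTERN.match(line)
        if m is None:
            continue  # malformed entry
        if not m["status"].startswith("2"):
            continue  # failed or redirected request
        if m["url"].lower().endswith(IGNORED_SUFFIXES):
            continue  # image, stylesheet, or script
        ts = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
        yield m["ip"], ts, m["url"]


def identify_users(records):
    """Stage 2, user identification: group requests by IP address."""
    by_user = defaultdict(list)
    for ip, ts, url in records:
        by_user[ip].append((ts, url))
    for visits in by_user.values():
        visits.sort()  # chronological order within each user
    return by_user


def identify_sessions(visits, timeout=SESSION_TIMEOUT):
    """Stage 3, session identification: split on gaps longer than timeout."""
    sessions = []
    for ts, url in visits:
        if sessions and ts - sessions[-1][-1][0] <= timeout:
            sessions[-1].append((ts, url))
        else:
            sessions.append([(ts, url)])
    return sessions


def time_on_pages(session):
    """Seconds spent on each page, taken as the gap to the next request.

    The last page of a session has no following request, so it is omitted.
    """
    return [
        (url, (next_ts - ts).total_seconds())
        for (ts, url), (next_ts, _) in zip(session, session[1:])
    ]


def page_frequency(by_user):
    """Number of requests per page across all users."""
    return Counter(url for visits in by_user.values() for _, url in visits)


# Hypothetical log lines used only for illustration.
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Oct/2023:13:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.1 - - [10/Oct/2023:13:00:01 +0000] "GET /style.css HTTP/1.1" 200 128',
    '10.0.0.1 - - [10/Oct/2023:13:02:00 +0000] "GET /products.html HTTP/1.1" 200 2048',
    '10.0.0.2 - - [10/Oct/2023:13:05:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.1 - - [10/Oct/2023:14:30:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.2 - - [10/Oct/2023:13:06:00 +0000] "GET /missing.html HTTP/1.1" 404 0',
]

if __name__ == "__main__":
    users = identify_users(clean(SAMPLE_LOG))
    for ip, visits in users.items():
        for session in identify_sessions(visits):
            print(ip, [url for _, url in session], time_on_pages(session))
    print(page_frequency(users).most_common())
```

Identifying users by IP address alone, as the abstract describes, merges visitors behind a shared proxy and splits a visitor whose address changes, so the sessions produced by this sketch are approximations.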
