Preprocessing the web server logs: an illustrative approach for effective usage mining

Data preprocessing is an essential step in discovering behavioral patterns from web logs. Analyzing web logs helps system administrators ensure adequate bandwidth and maintain server capacity for their business websites. A web log file records user activity on a site over a period of time, and so offers valuable insight into how the site is actually used. Unlike the artificial setting of a usability lab, it captures real usage on a live production system. This paper focuses on the preprocessing techniques implemented in a specially designed Web Sift (WebIS) tool for an IIS web server, and also proposes some efficient heuristics and techniques.
