Data pre-processing on web server logs for generalized association rules mining algorithm

Web log file analysis began as a way for IT administrators to ensure adequate bandwidth and server capacity on their organization's website. Log file data can offer valuable insight into website usage. It reflects actual usage under natural working conditions, in contrast to the artificial setting of a usability lab. It also represents the activity of many users over a potentially long period of time, in contrast to a limited number of users observed for an hour or two each. This paper describes the pre-processing techniques applied to IIS Web Server logs, from the raw log file up to the point at which the mining process can be performed. Pre-processing is a tedious process, and its details depend on the mining algorithm and the purpose of the application.
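As a rough illustration of what the first pre-processing steps on an IIS log might look like, the following is a minimal sketch in Python. It assumes the log is written in the W3C Extended Log File Format, where a `#Fields:` directive names the space-separated columns of the entries that follow. The function names, the list of static-file suffixes, the rule of keeping only 2xx responses, and the sample lines are all hypothetical choices made for this example; they are not the paper's actual procedure.

```python
from typing import Dict, Iterable, List

# Hypothetical list of suffixes treated as non-page resources (an assumption).
STATIC_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js", ".ico")


def parse_w3c_log(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Parse W3C Extended Log File Format lines into dicts keyed by field name."""
    fields: List[str] = []
    records: List[Dict[str, str]] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directive line; "#Fields:" names the columns of the entries below.
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip entries that do not match the declared fields
        records.append(dict(zip(fields, values)))
    return records


def keep_page_requests(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop static-resource requests and non-2xx responses (assumed cleaning rule)."""
    kept = []
    for rec in records:
        uri = rec.get("cs-uri-stem", "").lower()
        status = rec.get("sc-status", "")
        if uri.endswith(STATIC_SUFFIXES):
            continue
        if not status.startswith("2"):
            continue
        kept.append(rec)
    return kept


# Made-up sample lines, for illustration only.
sample = [
    "#Software: Microsoft Internet Information Services 6.0",
    "#Fields: date time c-ip cs-method cs-uri-stem sc-status",
    "2004-01-01 10:00:00 192.168.0.1 GET /index.asp 200",
    "2004-01-01 10:00:01 192.168.0.1 GET /images/logo.gif 200",
    "2004-01-01 10:00:05 192.168.0.2 GET /missing.asp 404",
]

cleaned = keep_page_requests(parse_w3c_log(sample))
for rec in cleaned:
    print(rec["c-ip"], rec["cs-uri-stem"])
```

Reading the column names from the `#Fields:` directive, rather than hard-coding positions, keeps the sketch usable when the server is configured to log a different set of fields. Which entries count as noise would in practice depend on the mining algorithm and the application.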