Data Preprocessing in Web Usage Mining

At present, research on Web Usage Mining focuses mainly on pattern discovery (including association rules, sequential patterns, etc.) and pattern analysis. Research on the main data source, namely the preprocessing of web logs, is comparatively rare. Because high-quality data does much to improve the precision of pattern mining, this paper studies that aspect and proposes an efficient data preprocessing method.