Feature Matrices: A Model for Efficient and Anonymous Web Usage Mining

The recent growth of startup companies in the area of Web Usage Mining is a strong indication of how effective usage data is for understanding user behavior. However, the approach industry takes to Web Usage Mining is off-line, and hence intrusive and static, and it cannot differentiate between the various roles a single user might play. To address this, several researchers have studied probabilistic and distance-based models that summarize the collected data and retain only the features important for analysis. These models are either not flexible enough to trade accuracy for performance according to application requirements, or not adaptable in real time because updating the model is too complex. In this paper, we propose a new model, the FM model, which is flexible, tunable, and adaptable, and which can be used for both anonymous and on-line analysis. We also introduce a novel similarity measure for accurate comparison among FM models of navigation paths or clusters of paths. We conducted several experiments to evaluate and verify the FM model.
