Integrating Web Usage and Content Mining for More Effective Personalization

Recent proposals have suggested Web usage mining as an enabling mechanism to overcome the problems associated with more traditional Web personalization techniques such as collaborative or content-based filtering. These problems include lack of scalability, reliance on subjective user ratings or static profiles, and the inability to capture a richer set of semantic relationships among objects (in content-based systems). Yet usage-based personalization can be problematic when little usage data is available for some objects or when the site content changes regularly. For more effective personalization, both usage and content attributes of a site must be integrated into a Web mining framework and used by the recommendation engine in a uniform manner. In this paper we present such a framework, distinguishing between the offline tasks of data preparation and mining, and the online process of customizing Web pages based on a user's active session. We describe effective techniques based on clustering to obtain a uniform representation for both site usage and site content profiles, and we show how these profiles can be used to perform real-time personalization.
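
To make the idea of a uniform profile representation concrete, the following is a minimal sketch, not the method of the paper itself. It assumes that both usage profiles and content profiles have already been reduced offline to weighted sets of pages, and that the online step matches the active session against every profile with cosine similarity. The profile names, page URLs, weights, the `min_score` threshold, and the scoring rule (profile weight times match score) are all hypothetical choices made for illustration.

```python
from math import sqrt

# Hypothetical profiles produced offline. Each maps a page URL to a weight.
# A usage profile would come from clustering user sessions; a content profile
# from clustering pages by their keywords. Both share one representation.
PROFILES = {
    "usage:research": {"/papers": 1.0, "/projects": 0.8, "/people": 0.4},
    "content:courses": {"/courses": 1.0, "/syllabus": 0.9, "/papers": 0.3},
}


def cosine(a, b):
    """Cosine similarity between two sparse page-weight vectors."""
    dot = sum(w * b.get(page, 0.0) for page, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(session, profiles, min_score=0.1):
    """Rank pages not yet visited in the active session.

    Assumed scoring rule: a page's score is its weight in a profile times
    that profile's similarity to the session; the best score per page wins.
    """
    active = {page: 1.0 for page in session}
    scores = {}
    for profile in profiles.values():
        match = cosine(active, profile)
        for page, weight in profile.items():
            if page in active:
                continue  # never recommend a page the user already visited
            score = weight * match
            if score >= min_score:
                scores[page] = max(scores.get(page, 0.0), score)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for page, score in recommend(["/papers"], PROFILES):
        print(f"{page}\t{score:.3f}")
```

Because the recommender sees only page-weight vectors, it does not need to know whether a profile was derived from usage or from content, which is the sense in which the two sources are used "in a uniform manner" here.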