Collecting user access patterns for building user profiles and collaborative filtering

This paper proposes a new learning mechanism that transparently extracts user preferences for a World Wide Web recommender system. The general idea is to use the entropy of an accessed page to determine its interestingness, where the entropy is based on the probability that the page occurs after the sequence of pages the user has already accessed. The probability distribution over pages is obtained by collecting the access patterns of users navigating the Web. A finite-context model represents this usage information. Based on the proposed model, we have developed an autonomous agent, named ProfBuilder, that works as an online recommender system for a Web site. ProfBuilder uses the usage information as a basis for both content-based and collaborative filtering.
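
To make the idea concrete, the following is a minimal Python sketch of a finite-context model over page-access sequences. It is an illustration under stated assumptions, not ProfBuilder's implementation. The class name `FiniteContextModel`, the context order of 1, the maximum-likelihood probability estimate, and the use of surprisal, -log2 P(page | context), as the per-page information measure are all assumptions. The abstract does not give the exact formula or say how the value maps to interest. The page names are made up.

```python
import math
from collections import Counter, defaultdict


class FiniteContextModel:
    """Order-k context model over page-access sequences (illustrative only)."""

    def __init__(self, order=1):
        self.order = order
        # Maps a context (tuple of the last `order` pages) to a Counter of
        # the pages that followed it.
        self.counts = defaultdict(Counter)

    def train(self, session):
        """Record every (context, next page) pair in one browsing session."""
        for i in range(self.order, len(session)):
            context = tuple(session[i - self.order:i])
            self.counts[context][session[i]] += 1

    def probability(self, context, page):
        """Maximum-likelihood estimate of P(page | context); 0.0 if unseen."""
        followers = self.counts.get(tuple(context))
        if not followers:
            return 0.0
        return followers[page] / sum(followers.values())

    def surprisal(self, context, page):
        """Information content in bits: -log2 P(page | context)."""
        p = self.probability(context, page)
        return math.inf if p == 0.0 else -math.log2(p)


# Hypothetical access logs: each list is one user's session on a site.
sessions = [
    ["home", "news", "sports"],
    ["home", "news", "weather"],
    ["home", "news", "sports"],
    ["home", "about"],
]

model = FiniteContextModel(order=1)
for session in sessions:
    model.train(session)

p_news = model.probability(["home"], "news")    # 3 of 4 visits from "home"
bits_about = model.surprisal(["home"], "about")  # 1 of 4 visits from "home"
```

In this toy log, "about" follows "home" in one of four sessions, so it carries more bits than the common "news" transition. A real system would also need smoothing for unseen transitions and a rule for turning these values into a profile score. Both are left out here because the abstract does not describe them.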
