Personalized web search by mapping user queries to categories

Current web search engines are built to serve all users alike, independent of the needs of any individual user. Personalized web search instead carries out retrieval for each user in a way that incorporates that user's interests. We propose a novel technique that maps a user query to a set of categories representing the user's search intention. This set of categories can serve as context to disambiguate the words in the query. A user profile is learned from the user's search history, and a general profile is learned from a category hierarchy. The two profiles are combined to map a user query to a set of categories. Several learning and combining algorithms are evaluated and found to be effective. Among the algorithms for learning a user profile, we choose the Rocchio-based method for its simplicity, efficiency, and adaptivity. Experimental results indicate that our technique for personalizing web search is both effective and efficient.
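As a rough illustration of the idea of learning a user profile and combining it with a general profile, the following Python sketch treats each profile as a table of per-category term weights. The abstract does not give the formulas, so everything here is an assumption made for illustration: the function names (`rocchio_update`, `map_query`), the `decay` and `user_weight` parameters, the cosine scoring, the linear combination of the two scores, and the toy "apple" data. It is a simplified Rocchio-style update, not the authors' exact method.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rocchio_update(profile, history, decay=0.5):
    """Rocchio-style update (assumed form): decay the old weights, then add
    the average term vector of the new history entries for each category.

    profile: {category: {term: weight}}
    history: list of (terms, categories) pairs from past searches
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for terms, categories in history:
        for category in categories:
            counts[category] += 1
            for term in terms:
                sums[category][term] += 1.0

    updated = {c: {t: decay * w for t, w in vec.items()}
               for c, vec in profile.items()}
    for category, vec in sums.items():
        row = updated.setdefault(category, {})
        for term, total in vec.items():
            row[term] = row.get(term, 0.0) + total / counts[category]
    return updated


def map_query(query_terms, user_profile, general_profile,
              user_weight=0.5, top_k=3):
    """Rank categories by a weighted sum of the query's similarity to the
    user profile and to the general profile (assumed combining rule)."""
    query = {t: 1.0 for t in query_terms}
    scores = {}
    for category in set(user_profile) | set(general_profile):
        user_score = cosine(query, user_profile.get(category, {}))
        general_score = cosine(query, general_profile.get(category, {}))
        scores[category] = (user_weight * user_score
                            + (1.0 - user_weight) * general_score)
    # Sort by descending score, breaking ties by category name.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


if __name__ == "__main__":
    # Toy general profile, standing in for one learned from a category hierarchy.
    general = {
        "Fruit": {"apple": 1.0, "pie": 1.0},
        "Computers": {"apple": 1.0, "mac": 1.0},
    }
    # Toy search history: past query terms with the categories they belonged to.
    history = [(["apple", "mac"], ["Computers"]),
               (["mac", "software"], ["Computers"])]
    user = rocchio_update({}, history)
    print(map_query(["apple"], user, general))
```

In this toy setup the general profile alone cannot separate the two senses of "apple", while the user's history shifts the ranking toward the category the user has searched before. Because each update only decays old weights and adds new averages, the profile can be refreshed incrementally as more history arrives.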
