Introducing Query Expansion Methods for Collaborative Information Retrieval

The accuracy of ad-hoc document retrieval systems has plateaued in the last few years. At DFKI, we are working on collaborative information retrieval (CIR) systems, which unobtrusively learn from their users’ search processes. We define CIR as a task in which an information retrieval system uses information gathered from the previous search processes of one or several users to improve retrieval performance for the current user. This paper describes query expansion methods for CIR. We focus on a restricted setting in which only old queries and the correct answer documents for those queries are available for improving a new query, and we propose new query expansion procedures for this setting. We show how the collaboration of individual users can improve overall retrieval performance, expressed in terms of the quality and utility of the retrieved information regardless of specific user groups.
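
To make the restricted setting concrete, the following is a minimal sketch, not the expansion procedures proposed in the paper. It assumes a history of old queries paired with their correct answer documents, a naive whitespace tokenizer, and Jaccard term overlap as the query similarity; the names `expand_query`, `history`, `num_terms`, and `min_similarity` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return text.lower().split()


def jaccard(a, b):
    """Term-overlap similarity between two token collections (assumed measure)."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def expand_query(new_query, history, num_terms=3, min_similarity=0.2):
    """Expand new_query with terms from documents that answered similar old queries.

    history is a list of (old_query, [relevant_document_text, ...]) pairs,
    i.e. the only information available in the restricted setting.
    """
    query_terms = tokenize(new_query)
    scores = Counter()
    for old_query, relevant_docs in history:
        sim = jaccard(query_terms, tokenize(old_query))
        if sim < min_similarity:
            continue  # ignore old queries that share too few terms
        for doc in relevant_docs:
            # Each distinct new term in a relevant document earns the
            # similarity of the old query that document answered.
            for term in set(tokenize(doc)):
                if term not in query_terms:
                    scores[term] += sim
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return query_terms + [term for term, _ in ranked[:num_terms]]


history = [
    ("jaguar speed", ["jaguar cat runs fast", "big cat speed"]),
    ("python syntax", ["python language grammar"]),
]
expanded = expand_query("jaguar top speed", history)
```

Here the new query inherits terms only from the documents that answered the similar old query "jaguar speed"; the unrelated "python syntax" entry contributes nothing. A real system would replace the tokenizer, similarity measure, and term scoring with whatever the retrieval model calls for.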
