Automatic Query Expansion Using SMART: TREC 3

The Smart information retrieval project emphasizes completely automatic approaches to the understanding and retrieval of large quantities of text. We continue our work in TREC 3, performing runs in the routing, ad-hoc, and foreign language environments. Our major focus is massive query expansion: adding from 300 to 530 terms to each query. These terms come from known relevant documents in the case of routing, and from just the top retrieved documents in the case of ad-hoc and Spanish. This approach improves effectiveness by 7% to 25% in the various experiments. Other ad-hoc work extends our investigations into combining global similarities, which give an overall indication of how well a document matches a query, with local similarities, which identify a smaller part of the document that matches the query. Using overlapping text windows as the definition of "local", we achieve a 16% improvement.
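
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the SMART implementation. It assumes raw term-frequency vectors, a Rocchio-style weighting with hypothetical parameters `alpha` and `beta`, cosine similarity, and fixed-size token windows. All function names, the window size, and the step are illustrative assumptions, and the sketch adds only a handful of terms rather than the 300 to 530 described above.

```python
from collections import Counter
import math


def expand_query(query_terms, ranked_docs, top_docs=2, n_new_terms=3,
                 alpha=1.0, beta=0.5):
    """Rocchio-style expansion (illustrative, not SMART's weighting).

    The top-ranked documents are assumed relevant, as in the ad-hoc case;
    for routing, known relevant documents would be passed in instead.
    """
    feedback = ranked_docs[:top_docs]
    centroid = Counter()
    for doc in feedback:
        for term, tf in Counter(doc).items():
            centroid[term] += tf / len(feedback)

    query_tf = Counter(query_terms)
    # Candidate expansion terms: highest centroid weight first, ties by name.
    new_terms = sorted((t for t in centroid if t not in query_tf),
                       key=lambda t: (-centroid[t], t))[:n_new_terms]

    # Original terms are reweighted; new terms get feedback weight only.
    expanded = {t: alpha * tf + beta * centroid[t] for t, tf in query_tf.items()}
    expanded.update({t: beta * centroid[t] for t in new_terms})
    return expanded


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def best_window_similarity(query_vec, doc_terms, window=4, step=2):
    """Local similarity: best cosine over overlapping token windows.

    Windows overlap whenever step < window; the last window may be shorter.
    """
    starts = range(0, max(len(doc_terms) - window, 0) + step, step)
    return max(cosine(query_vec, Counter(doc_terms[s:s + window]))
               for s in starts)


if __name__ == "__main__":
    docs = [
        ["query", "expansion", "feedback", "terms", "feedback"],
        ["relevance", "feedback", "query"],
        ["cooking", "recipes"],
    ]
    expanded = expand_query(["query", "expansion"], docs, n_new_terms=2)
    for doc in docs:
        global_sim = cosine(expanded, Counter(doc))
        local_sim = best_window_similarity(expanded, doc)
        print(doc, round(global_sim, 3), round(local_sim, 3))
```

How the global and local scores are combined into a final ranking is not specified in the abstract, so the sketch reports them separately rather than guessing at a combination rule.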
