Paper information - A unified model for metasearch, pooling, and system evaluation

We present a unified model which, given the ranked lists of documents returned by multiple retrieval systems in response to a given query, simultaneously solves the problems of (1) fusing the ranked lists of documents in order to obtain a high-quality combined list (metasearch); (2) generating document collections likely to contain large fractions of relevant documents (pooling); and (3) accurately evaluating the underlying retrieval systems with small numbers of relevance judgments (efficient system assessment). Our approach is based on the Hedge algorithm for on-line learning. In effect, our proposed system "learns" which documents are likely to be relevant from a sequence of on-line relevance judgments. In experiments using TREC data, our methodology is shown to outperform standard methods for metasearch, pooling, and system evaluation, often remarkably so.
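The weight update at the core of Hedge can be sketched as follows. This is a minimal illustration, not the paper's implementation: the standard Hedge rule multiplies each expert's weight by `beta ** loss`, with the loss in [0, 1] and `beta` strictly between 0 and 1. Each retrieval system is treated as an expert. The function `toy_loss`, the system names, the document IDs, and the judgments are hypothetical and chosen only for illustration; the abstract does not specify the loss function the authors use.

```python
from typing import Dict, List, Tuple


def hedge_update(weights: List[float], losses: List[float], beta: float = 0.5) -> List[float]:
    """One Hedge step: scale each weight by beta**loss, then renormalize to sum to 1."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    updated = [w * beta ** loss for w, loss in zip(weights, losses)]
    total = sum(updated)
    return [w / total for w in updated]


def toy_loss(ranked_list: List[str], doc: str, relevant: bool) -> float:
    """Hypothetical loss in [0, 1] (an assumption, not the paper's definition).

    A system earns more "credit" the higher it ranks the judged document.
    Credit is good for a relevant document and bad for a nonrelevant one.
    """
    if doc in ranked_list:
        credit = 1.0 - ranked_list.index(doc) / len(ranked_list)
    else:
        credit = 0.0  # unretrieved documents earn no credit
    return 1.0 - credit if relevant else credit


# Hypothetical ranked lists from two systems for a single query.
systems: Dict[str, List[str]] = {
    "A": ["d1", "d2", "d3"],
    "B": ["d3", "d2", "d1"],
}
names = list(systems)
weights = [1.0 / len(names)] * len(names)  # start from uniform weights

# Hypothetical on-line relevance judgments, processed one at a time.
judgments: List[Tuple[str, bool]] = [("d1", True), ("d3", False)]
for doc, relevant in judgments:
    losses = [toy_loss(systems[name], doc, relevant) for name in names]
    weights = hedge_update(weights, losses)

for name, weight in zip(names, weights):
    print(f"system {name}: weight {weight:.3f}")
```

In this toy run, system A ranks the relevant document first and the nonrelevant one last, so its weight ends up above that of system B. Under the assumptions above, the learned weights could serve as estimates of system quality, as weights for fusing the ranked lists, and as a guide to which document to judge next.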
