Paper Information - An Unbiased Generative Model for Setting Dissemination Thresholds

Information filtering systems based on statistical retrieval models usually compute a numeric score that indicates how well each document matches each profile. Documents with scores above profile-specific dissemination thresholds are delivered. Optimal dissemination thresholds are usually difficult to determine a priori, so they are often learned during filtering, using relevance feedback about disseminated documents. However, the scores of disseminated documents are a biased sample of the complete distribution of document scores, which causes some algorithms to learn suboptimal thresholds.
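The sampling bias described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' model: the scores are drawn from a Gaussian purely for illustration, and the names `simulate_scores` and `disseminate`, the threshold value, and the distribution parameters are all hypothetical. It shows that the scores the system receives feedback on (those above the threshold) form a truncated sample whose mean is higher than the mean of the full score distribution.

```python
import random
import statistics


def simulate_scores(n, mean, stdev, seed=0):
    """Draw n document scores from a Gaussian (an illustrative assumption)."""
    rng = random.Random(seed)
    return [rng.gauss(mean, stdev) for _ in range(n)]


def disseminate(scores, threshold):
    """Keep only the scores above the dissemination threshold.

    These are the only documents for which relevance feedback is available.
    """
    return [s for s in scores if s > threshold]


# Hypothetical parameters, chosen only to make the effect visible.
scores = simulate_scores(10_000, mean=0.4, stdev=0.1)
threshold = 0.45
delivered = disseminate(scores, threshold)

true_mean = statistics.fmean(scores)         # mean over all documents
observed_mean = statistics.fmean(delivered)  # mean over delivered documents only

print(f"documents scored:    {len(scores)}")
print(f"documents delivered: {len(delivered)}")
print(f"mean of all scores:       {true_mean:.3f}")
print(f"mean of delivered scores: {observed_mean:.3f}")
```

An algorithm that fits a score distribution to the delivered documents alone, without accounting for the truncation at the threshold, would estimate its parameters from the second mean rather than the first, which is one way a learned threshold can end up suboptimal.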

Yi Zhang | Jamie Callan
