Selective Term Proximity Scoring Via BP-ANN

When two query terms occur together in a document, a close relationship between them, and between them and the document itself, is more likely if they appear in nearby positions. However, ranking functions that incorporate term proximity (TP) require larger indexes than traditional document-level indexing, which slows query processing. Previous studies also show that this technique is not effective for all types of queries. Here we propose a document ranking model that decides, based on a set of query features, for which queries a proximity-based ranking would be beneficial. We take a machine learning approach (a BP-ANN) to determine whether applying TP will pay off for a given query. Experiments show that the proposed model improves ranking quality while also reducing the overhead incurred by computing TP statistics.

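The sketch below illustrates the overall idea under stated assumptions: a small feed-forward network (standing in for the paper's BP-ANN, with hand-set rather than trained weights) inspects a few query-level features and gates whether the proximity component is computed at all. The feature names, the thresholding at 0.5, and the additive BM25-plus-proximity combination are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of selective term-proximity scoring, assuming hypothetical
# feature definitions and an untrained stand-in for the paper's BP-ANN.

import math
import numpy as np


def query_features(query_terms, collection_stats):
    """Hypothetical query-level features (illustrative only)."""
    n_docs = collection_stats["n_docs"]
    dfs = [collection_stats["df"].get(t, 0) for t in query_terms]
    idfs = [math.log((n_docs + 1) / (df + 1)) for df in dfs]
    return np.array([
        len(query_terms),          # query length
        sum(idfs) / len(idfs),     # mean IDF of query terms
        max(idfs) - min(idfs),     # IDF spread
    ])


class TinyBPANN:
    """One-hidden-layer network; in the paper's setting the weights would be
    learned by back-propagation on labelled queries (not trained here)."""

    def __init__(self, w1, b1, w2, b2):
        self.w1, self.b1, self.w2, self.b2 = w1, b1, w2, b2

    def predict_use_tp(self, x):
        h = np.tanh(self.w1 @ x + self.b1)
        p = 1.0 / (1.0 + math.exp(-(float(self.w2 @ h) + self.b2)))
        return p > 0.5             # True -> apply proximity scoring


def rank_score(doc, query_terms, bm25, tp_bonus, use_tp):
    """Document score: BM25 alone, or BM25 plus a proximity bonus,
    so the TP cost is only paid when the selector chooses it."""
    s = bm25(doc, query_terms)
    if use_tp:
        s += tp_bonus(doc, query_terms)
    return s
```

In use, `predict_use_tp` is evaluated once per query before retrieval, so queries routed to the BM25-only path never touch the positional (term-pair) index, which is where the efficiency saving described in the abstract comes from.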