Capreolus: A Toolkit for End-to-End Neural Ad Hoc Retrieval

We present Capreolus, a toolkit designed to facilitate end-to-end ad hoc retrieval experiments with neural networks by providing implementations of prominent neural ranking models within a common framework. Our toolkit adopts a standard reranking architecture via tight integration with the Anserini toolkit for candidate document generation using standard bag-of-words approaches. Using Capreolus, we are able to reproduce Yang et al.'s recent SIGIR 2019 finding that, in a reranking scenario on the test collection from the TREC 2004 Robust Track, many neural retrieval models do not significantly outperform a strong query expansion baseline. Furthermore, we find that this holds true for five additional models implemented in Capreolus. We describe the architecture and design of our toolkit, which includes a Web interface to facilitate comparisons between rankings returned by different models.
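The reranking architecture mentioned in the abstract can be illustrated with a minimal sketch. This is not Capreolus or Anserini code: the function names (`first_stage`, `rerank`, `toy_scorer`), the term-count scoring, and the toy documents are all assumptions made for illustration. A real setup would presumably use a bag-of-words ranker such as BM25 for the first stage and a trained neural model as the scorer.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def first_stage(query: str, docs: Dict[str, str], k: int) -> List[Tuple[str, float]]:
    """Hypothetical bag-of-words candidate generation.

    Scores each document by how often the query terms occur in it and
    returns the top-k documents with a non-zero score.
    """
    q_terms = set(tokenize(query))
    scored = []
    for doc_id, text in docs.items():
        counts = Counter(tokenize(text))
        score = float(sum(counts[t] for t in q_terms))
        if score > 0:
            scored.append((doc_id, score))
    # Sort by descending score, breaking ties by document id.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


def rerank(
    query: str,
    docs: Dict[str, str],
    candidates: List[Tuple[str, float]],
    scorer: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Rescore only the first-stage candidates with a second-stage scorer."""
    rescored = [(doc_id, scorer(query, docs[doc_id])) for doc_id, _ in candidates]
    rescored.sort(key=lambda pair: (-pair[1], pair[0]))
    return rescored


def toy_scorer(query: str, doc: str) -> float:
    """Stand-in for a neural ranker: fraction of document tokens that are query terms."""
    q_terms = set(tokenize(query))
    tokens = tokenize(doc)
    return sum(t in q_terms for t in tokens) / len(tokens) if tokens else 0.0


# Toy collection, invented for this example.
DOCS = {
    "d1": "neural ranking models for ad hoc retrieval",
    "d2": "retrieval retrieval retrieval of cooking recipes and many other unrelated things",
    "d3": "weather report",
}

candidates = first_stage("neural retrieval", DOCS, k=2)
reranked = rerank("neural retrieval", DOCS, candidates, toy_scorer)
```

In this toy run the first stage ranks `d2` ahead of `d1` because it counts raw term occurrences, while the second stage moves `d1` to the top. The point of the design is that the more expensive scorer only ever sees the small candidate list, never the whole collection.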

Jimmy Lin | Xinyu Zhang | Andrew Yates | Wei Yang | Siddhant Arora | Kevin Martin Jose

[1] Zhiyuan Liu, et al. End-to-End Neural Ad-hoc Ranking with Kernel Pooling, 2017, SIGIR.

[2] Nazli Goharian, et al. CEDR: Contextualized Embeddings for Document Ranking, 2019, SIGIR.

[3] Nick Craswell, et al. Learning to Match using Local and Distributed Representations of Text for Web Search, 2016, WWW.

[4] Larry P. Heck, et al. Learning deep structured semantic models for web search using clickthrough data, 2013, CIKM.

[5] Yelong Shen, et al. A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval, 2014, CIKM.

[6] Grace Hui Yang, et al. DeepTileBars: Visualizing Term Distribution for Neural Information Retrieval, 2019, AAAI.

[7] Jimmy J. Lin, et al. Critically Examining the "Neural Hype": Weak Baselines and the Additivity of Effectiveness Gains from Neural Ranking Models, 2019, SIGIR.

[8] W. Bruce Croft, et al. A Deep Relevance Matching Model for Ad-hoc Retrieval, 2016, CIKM.

[9] Ion Androutsopoulos, et al. Deep Relevance Ranking Using Enhanced Document-Query Interactions, 2018, EMNLP.

[10] Jamie Callan, et al. Deeper Text Understanding for IR with Contextual Neural Language Modeling, 2019, SIGIR.

[11] Jimmy J. Lin, et al. Cross-Domain Modeling of Sentence-Level Evidence for Document Retrieval, 2019, EMNLP.

[12] Jimmy J. Lin, et al. Simple Applications of BERT for Ad Hoc Document Retrieval, 2019, ArXiv.

[13] Jimmy J. Lin, et al. Anserini, 2018, Journal of Data and Information Quality.

[14] Zhiyuan Liu, et al. Convolutional Neural Networks for Soft-Matching N-Grams in Ad-hoc Search, 2018, WSDM.

[15] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[16] W. Bruce Croft, et al. Indri: A language-model based search engine for complex queries (extended version), 2005.

[17] Jun Xu, et al. Modeling Diverse Relevance Patterns in Ad-hoc Retrieval, 2018, SIGIR.

[18] Craig MacDonald, et al. From Puppy to Maturity: Experiences in Developing Terrier, 2012, OSIR@SIGIR.

[19] Gerard de Melo, et al. PACRR: A Position-Aware Neural IR Model for Relevance Matching, 2017, EMNLP.

[20] Xiang Ji, et al. MatchZoo: A Learning, Practicing, and Developing System for Neural Text Matching, 2019, SIGIR.