An Exploration of Approaches to Integrating Neural Reranking Models in Multi-Stage Ranking Architectures

We explore different approaches to integrating a simple convolutional neural network (CNN) with the Lucene search engine in a multi-stage ranking architecture. Our models are trained using the PyTorch deep learning toolkit, which is implemented in C/C++ with a Python frontend. One obvious integration strategy is to expose the neural network directly as a service. For this, we use Apache Thrift, a software framework for building scalable cross-language services. In exploring alternative architectures, we observe that once trained, the feedforward evaluation of neural networks is quite straightforward. Therefore, we can extract the parameters of a trained CNN from PyTorch and import the model into Java, taking advantage of the Java Deeplearning4j library for feedforward evaluation. This has the advantage that the entire end-to-end system can be implemented in Java. As a third approach, we can extract the neural network from PyTorch and "compile" it into a C++ program that exposes a Thrift service. We evaluate these alternatives in terms of performance (latency and throughput) as well as ease of integration. Experiments show that feedforward evaluation of the convolutional neural network is significantly slower in Java, while the performance of the compiled C++ network does not consistently beat the PyTorch implementation.
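As a rough illustration of the parameter-export step described above, the following Python sketch saves each tensor of a trained PyTorch CNN as a plain NumPy array that a Java (Deeplearning4j) or C++ program could reload to reproduce the feedforward pass. The model definition, layer sizes, and file paths are illustrative assumptions for this sketch, not the paper's actual code.

    import os
    import numpy as np
    import torch
    import torch.nn as nn

    # Illustrative stand-in for the trained reranking CNN; the actual
    # architecture and hyperparameters used in the paper may differ.
    class SimpleCNN(nn.Module):
        def __init__(self, emb_dim=300, num_filters=100, kernel_size=3):
            super().__init__()
            self.conv = nn.Conv1d(emb_dim, num_filters, kernel_size, padding=1)
            self.fc = nn.Linear(num_filters, 1)

        def forward(self, x):                # x: (batch, emb_dim, seq_len)
            h = torch.relu(self.conv(x))
            h, _ = torch.max(h, dim=2)       # max-pool over the sequence
            return self.fc(h)                # relevance score per candidate

    model = SimpleCNN()
    # ... training with PyTorch would happen here ...

    # Export every learned tensor so that the feedforward evaluation can be
    # re-implemented outside of PyTorch (e.g., in Deeplearning4j or C++).
    os.makedirs("weights", exist_ok=True)
    for name, tensor in model.state_dict().items():
        np.save(os.path.join("weights", f"{name}.npy"), tensor.cpu().numpy())

On the Java or C++ side, the same named arrays would be loaded and wired into an equivalent convolution, max-pooling, and linear layer to score query-passage pairs at reranking time.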
