Neural Query Performance Prediction using Weak Supervision from Multiple Signals

Predicting the performance of a search engine for a given query is a fundamental and challenging task in information retrieval. Accurate performance predictors can be used in various ways, such as triggering an action, choosing the most effective ranking function per query, or selecting the best variant from multiple query formulations. In this paper, we propose a general end-to-end query performance prediction framework based on neural networks, called NeuralQPP. Our framework consists of multiple components, each learning a representation suitable for performance prediction. These representations are then aggregated and fed into a prediction sub-network. We train our models with multiple weak supervision signals, an unsupervised learning approach that uses existing unsupervised performance predictors as sources of weak labels. We also propose a simple yet effective component dropout technique to regularize our model. Our experiments on four newswire and web collections demonstrate that NeuralQPP significantly outperforms state-of-the-art baselines in nearly every case. Furthermore, we thoroughly analyze the effectiveness of each component, each weak supervision signal, and all resulting combinations in our experiments.
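The flow described above (per-component representations, aggregation, a prediction sub-network, and component dropout) can be illustrated with a minimal sketch. It is not the NeuralQPP implementation: the function names, the independent per-component drop probability, the rule that at least one component is always kept, concatenation as the aggregation step, and the single linear unit standing in for the prediction sub-network are all assumptions made for illustration.

```python
import random


def component_dropout(representations, drop_prob, rng):
    """Zero out whole component representations at random (training time only).

    Assumption: each component is dropped independently with probability
    drop_prob, and one randomly chosen component is kept if all would be
    dropped, so the predictor never receives an all-zero input.
    """
    keep = [rng.random() >= drop_prob for _ in representations]
    if not any(keep):
        keep[rng.randrange(len(representations))] = True
    return [
        list(rep) if kept else [0.0] * len(rep)
        for rep, kept in zip(representations, keep)
    ]


def aggregate(representations):
    """Aggregate by concatenation (an assumed choice of aggregation)."""
    return [value for rep in representations for value in rep]


def predict(features, weights, bias):
    """Hypothetical stand-in for the prediction sub-network: one linear unit."""
    return sum(w * x for w, x in zip(weights, features)) + bias


# Usage: three toy component representations for a single query.
rng = random.Random(0)
components = [[0.2, 0.4], [0.1, 0.3], [0.5, 0.6]]
dropped = component_dropout(components, drop_prob=0.5, rng=rng)
features = aggregate(dropped)
score = predict(features, weights=[1.0] * len(features), bias=0.0)
```

Dropping an entire component, rather than individual units, is meant to keep the predictor from leaning too heavily on any single component's representation. At prediction time the dropout step would be skipped and all components passed through unchanged.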
