Investigating Weak Supervision in Deep Ranking

Abstract A number of deep neural networks have been proposed to improve document ranking in information retrieval. However, training these models usually requires large amounts of labeled data, so data shortage has become a major obstacle to improving neural ranking models. Recently, several weakly supervised methods have been proposed to address this challenge, using heuristics or user interactions on Search Engine Result Pages (SERPs) to generate weak relevance labels. In this work, we adopt two kinds of weakly supervised relevance, BM25-based relevance and click model-based relevance, and investigate in depth how they differ when used to train neural ranking models. Experimental results show that BM25-based relevance helps models capture more exact matching signals, while click model-based relevance improves the ranking of documents that users are likely to prefer. We further propose a cascade ranking framework that combines the two kinds of weakly supervised relevance; it significantly improves the ranking performance of neural ranking models and outperforms the best result in the NTCIR-13 We Want Web (WWW) task. This work reveals the potential of building better document retrieval systems from multiple kinds of weak relevance signals.
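
To make the notion of BM25-based weak relevance concrete, the following is a minimal sketch of how an unsupervised BM25 score could be computed and used as a weak label for a query-document pair. It is an illustration only, not the authors' pipeline: the function name `bm25_scores`, the parameter values `k1=1.2` and `b=0.75`, the smoothed IDF variant, and the toy corpus are all assumptions.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25.

    k1 and b are common default values, assumed here for illustration.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "1 +" inside the log keeps it non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus (hypothetical): the scores act as weak relevance labels.
docs = [
    "neural ranking models for web search".split(),
    "click models for web search".split(),
    "cooking pasta at home".split(),
]
weak_labels = bm25_scores("neural ranking".split(), docs)
```

Because BM25 only rewards query terms that literally occur in a document, labels produced this way favor exact matches, which fits the abstract's observation that BM25-based relevance helps models capture exact matching signals.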
