Risk-Sensitive Deep Neural Learning to Rank

Learning to Rank (L2R) is the core task of many Information Retrieval (IR) systems. Recently, great effort has been devoted to exploring Deep Neural Networks (DNNs) for L2R, with significant results. However, risk-sensitiveness, an important recent advance in the L2R arena that reduces variability and increases trust, has not yet been incorporated into Deep Neural L2R. Risk-sensitive measures assess the risk that an IR system performs worse than a set of baseline IR systems on several queries. However, the risk-sensitive measures described in the literature have a non-smooth behavior, making them difficult, if not impossible, to optimize with DNNs. In this work we address this problem by proposing a family of new loss functions -- \riskloss\ -- that support smooth risk-sensitive optimization. \riskloss\ introduces two important contributions: (i) the substitution of the traditional NDCG or MAP metrics in risk-sensitive measures with smooth loss functions that evaluate the correlation between the predicted and the true relevance order of documents for a given query, and (ii) the use of distinct versions of the same DNN architecture as baselines by means of a multi-dropout technique during the smooth risk-sensitive optimization, avoiding the inconvenience of assessing multiple IR systems as part of DNN training. We empirically demonstrate significant gains of the proposed \riskloss\ functions when used with recent DNN methods on well-known web-search datasets such as WEB10K, YAHOO, and MQ2007. Our solutions reach improvements of 8% in effectiveness (NDCG) while improving risk-sensitiveness (the \grisk\ measure) by around 5% when applied together with a state-of-the-art Self-Attention DNN-L2R architecture.
Furthermore, \riskloss\ reduces losses against the best evaluated baselines by 28% and significantly improves over the state-of-the-art risk-sensitive non-DNN method (by up to 13.3%) while keeping (or even increasing) overall effectiveness. Together, these results establish a new state of the art in risk-sensitive DNN-L2R research.
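Contribution (i) replaces the non-smooth NDCG/MAP inside risk-sensitive measures with a differentiable rank-correlation surrogate. The following minimal sketch illustrates the general idea with a soft-rank approximation and a Spearman-like correlation loss; the function names, the temperature parameter τ, and the use of NumPy in place of an autodiff framework are our illustrative assumptions, not the paper's actual \riskloss\ definition.

```python
import numpy as np

def soft_ranks(scores, tau=0.1):
    """Differentiable rank surrogate: rank_i ~ 0.5 + sum_j sigmoid((s_j - s_i) / tau).

    As tau -> 0 this approaches the true (1-based, descending) ranks,
    while staying smooth enough for gradient-based training.
    """
    diff = scores[None, :] - scores[:, None]      # diff[i, j] = s_j - s_i
    sig = 1.0 / (1.0 + np.exp(-diff / tau))
    return 0.5 + sig.sum(axis=1)                  # sum includes sigmoid(0) = 0.5 for j = i

def spearman_surrogate_loss(pred_scores, true_relevance, tau=0.1):
    """1 - Pearson correlation between soft ranks of predictions and labels,
    a smooth stand-in for Spearman's rank correlation: near 0 when the
    predicted order matches the true order, near 2 when fully reversed."""
    rp = soft_ranks(np.asarray(pred_scores, dtype=float), tau)
    rt = soft_ranks(np.asarray(true_relevance, dtype=float), tau)
    rp, rt = rp - rp.mean(), rt - rt.mean()
    return 1.0 - (rp @ rt) / (np.linalg.norm(rp) * np.linalg.norm(rt) + 1e-12)
```

Because the loss is built from sigmoids and means, it is differentiable with respect to the predicted scores, which is what makes gradient-based optimization of a rank correlation possible in the first place.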

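Contribution (ii) obtains baselines without training multiple IR systems: several independent dropout masks applied to the same network yield variant predictions that play the baseline role inside a risk-sensitive trade-off. The sketch below is a hypothetical illustration of that idea; the function names are ours, and the objective shown is the classic URisk-style trade-off of Wang et al. (SIGIR 2012), not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def multi_dropout_scores(hidden, weights, n_masks=4, p=0.5):
    """Score the same hidden representation under several independent
    dropout masks; each variant acts as an in-training 'baseline'."""
    variants = []
    for _ in range(n_masks):
        mask = (rng.random(hidden.shape) >= p).astype(float)
        variants.append((hidden * mask / (1.0 - p)) @ weights)  # inverted dropout
    return variants

def risk_sensitive_objective(model_eff, baseline_effs, alpha=1.0):
    """URisk-style trade-off: gains over baselines count once,
    losses against them are penalized by a factor of (1 + alpha)."""
    deltas = model_eff - np.asarray(baseline_effs, dtype=float)
    wins = np.clip(deltas, 0.0, None).mean()
    losses = np.clip(deltas, None, 0.0).mean()   # non-positive by construction
    return wins + (1.0 + alpha) * losses
```

With alpha = 0 wins and losses are weighted symmetrically; increasing alpha makes the objective increasingly averse to queries where the model underperforms its dropout-derived baselines, which is the variability-reducing behavior the abstract describes.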