Towards Axiomatic Explanations for Neural Ranking Models

Neural networks have recently improved on state-of-the-art effectiveness in ad-hoc retrieval through machine-learned ranking functions. While neural retrieval models grow in complexity and impact, little is understood about how they correspond to well-studied IR principles. Recent work on interpretability in machine learning has provided tools and techniques for understanding neural models in general, yet there has been little progress towards explaining ranking models. We investigate whether the behavior of neural ranking models can be explained in terms of their congruence with well-understood principles of document ranking, using established theories from axiomatic IR. Axiomatic analysis of information retrieval models has formalized a set of constraints on ranking decisions that reasonable retrieval models should fulfill. We operationalize this axiomatic thinking to reproduce rankings based on combinations of elementary constraints. This allows us to investigate to what extent the ranking decisions of neural rankers can be explained by the existing retrieval axioms, and which axioms apply in which situations. Our experimental study considers a comprehensive set of axioms over several representative neural rankers. While the existing axioms already explain the particularly confident ranking decisions rather well, future work should extend the axiom set to cover the remaining, still "unexplainable" neural ranking decisions.
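To make the idea of reproducing rankings from elementary constraints concrete, the following is a minimal sketch and not the implementation used in the study. It assumes that each axiom can be expressed as a pairwise preference function returning +1, 0, or -1 for a document pair, that documents are whitespace-tokenized strings, and that preferences are combined by a simple majority vote. The two simplified axioms, the vote, and all function names (`tf_preference`, `coverage_preference`, `explained_fraction`) are illustrative assumptions.

```python
from itertools import combinations


def tf_preference(query, d1, d2, length_tolerance=0.1):
    """Simplified term-frequency constraint (assumption): among documents of
    similar length, prefer the one with more query term occurrences."""
    t1, t2 = d1.split(), d2.split()
    longest = max(len(t1), len(t2))
    # The constraint only applies when the lengths are roughly equal.
    if longest == 0 or abs(len(t1) - len(t2)) / longest > length_tolerance:
        return 0
    q = set(query.split())
    c1 = sum(1 for t in t1 if t in q)
    c2 = sum(1 for t in t2 if t in q)
    return (c1 > c2) - (c1 < c2)


def coverage_preference(query, d1, d2):
    """Simplified coverage constraint (assumption): prefer the document that
    matches more distinct query terms."""
    q = set(query.split())
    m1 = len(q & set(d1.split()))
    m2 = len(q & set(d2.split()))
    return (m1 > m2) - (m1 < m2)


def combined_preference(query, d1, d2, axioms):
    """Majority vote over axiom preferences (a stand-in for a learned
    combination); returns +1, 0, or -1."""
    total = sum(axiom(query, d1, d2) for axiom in axioms)
    return (total > 0) - (total < 0)


def explained_fraction(query, ranking, axioms):
    """Fraction of pairwise decisions in `ranking` (best document first) that
    agree with the combined axiom preference. Pairs on which the axioms
    express no preference are skipped."""
    agree = decided = 0
    for i, j in combinations(range(len(ranking)), 2):
        pref = combined_preference(query, ranking[i], ranking[j], axioms)
        if pref != 0:
            decided += 1
            agree += pref > 0  # ranker placed document i above document j
    return agree / decided if decided else 0.0


if __name__ == "__main__":
    # Toy ranking, assumed to come from some neural ranker.
    toy_ranking = [
        "neural ranking models rank documents",
        "neural networks learn useful features",
        "cooking pasta at home tonight",
    ]
    axioms = [tf_preference, coverage_preference]
    print(explained_fraction("neural ranking", toy_ranking, axioms))
```

In this sketch, a high agreement fraction means that the simplified constraints are consistent with most of the ranker's pairwise decisions, while pairs skipped for lack of any axiom preference correspond to decisions that the axiom set leaves unexplained.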
