Provably efficient, succinct, and precise explanations

We consider the problem of explaining the predictions of an arbitrary black-box model f: given query access to f and an instance x, output a small set of x’s features that in conjunction essentially determines f(x). We design an efficient algorithm with provable guarantees on the succinctness and precision of the explanations that it returns. Prior algorithms were either efficient but lacked such guarantees, or achieved such guarantees but were inefficient. We obtain our algorithm via a connection to the problem of implicitly learning decision trees. The implicit nature of this learning task allows for efficient algorithms even when the complexity of f necessitates an intractably large surrogate decision tree. We solve the implicit learning problem by bringing together techniques from learning theory, local computation algorithms, and complexity theory. Our approach of “explaining by implicit learning” shares elements of the two previously disparate methods for post-hoc explanations, global and local explanations, and we make the case that it enjoys advantages of both.
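To make the problem statement concrete, here is a minimal Python sketch of the interface: it estimates the precision of a candidate feature set by sampling completions of the unfixed features, and greedily grows the set until fixing those features essentially determines f(x). The greedy strategy, the uniform distribution over {0,1}^d, and the 0.95 precision threshold are illustrative assumptions; the paper’s actual algorithm obtains its efficiency and succinctness guarantees via implicit decision tree learning, not this naive baseline.

```python
import numpy as np

def estimated_precision(f, x, S, n_samples=1000, rng=None):
    """Estimate Pr[f(z) = f(x)] over random z agreeing with x on S.

    Coordinates outside S are resampled uniformly from {0, 1}
    (a simplifying assumption: inputs are drawn from the uniform
    distribution over the Boolean cube).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x)
    idx = list(S)
    z = rng.integers(0, 2, size=(n_samples, x.size))
    z[:, idx] = x[idx]                     # pin the explained features
    target = f(x)
    return np.mean([f(row) == target for row in z])

def greedy_explanation(f, x, precision=0.95, n_samples=1000, rng=None):
    """Grow a feature set S until fixing x's values on S essentially
    determines f(x), i.e. the estimated precision meets the target.

    A naive greedy baseline for illustration only: it spends
    O(d^2 * n_samples) queries to f and carries no succinctness
    guarantee, unlike the algorithm analyzed in the paper.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.asarray(x)
    S = set()
    while estimated_precision(f, x, S, n_samples, rng) < precision:
        # add the feature whose inclusion raises precision the most
        best = max(
            (i for i in range(x.size) if i not in S),
            key=lambda i: estimated_precision(f, x, S | {i}, n_samples, rng),
        )
        S.add(best)
    return sorted(S)

# Example: f is a 3-of-5 majority over the first five coordinates,
# so pinning any three of them to 1 fully determines f(x).
f = lambda v: int(np.sum(v[:5]) >= 3)
x = np.ones(10, dtype=int)
print(greedy_explanation(f, x))            # e.g. [0, 1, 2]
```

In the example, the last five coordinates are irrelevant to f, so a precise explanation never needs them; the succinctness question the paper addresses is how close an efficient algorithm can get to the smallest such set.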
