Contrastive Explanations with Local Foil Trees

Recent advances in interpretable Machine Learning (iML) and eXplainable AI (XAI) construct explanations based on the importance of features in classification tasks. In a high-dimensional feature space, however, this approach can become infeasible unless the set of important features is restricted. We propose to exploit the human tendency to ask questions like "Why this output (the fact) instead of that output (the foil)?" in order to reduce the reported features to those that play a main role in the requested contrast. Our method uses locally trained one-versus-all decision trees to identify the disjoint set of rules that cause the tree to classify data points as the foil and not as the fact. We illustrate this approach on three benchmark classification tasks.
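
The sketch below illustrates the idea described in the abstract, assuming a scikit-learn-style black-box classifier with a `predict` method. The Gaussian sampling scheme, the tree depth, the helper names (`local_foil_rules`, `path_to_root`), and the "longest shared path" proxy for the closest foil leaf are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def local_foil_rules(model, x, foil_class, n_samples=1000, scale=0.1, seed=0):
    """Train a local one-versus-all surrogate tree (foil vs. rest) around x and
    return the split rules that lead from x's (fact) leaf towards a foil leaf."""
    rng = np.random.default_rng(seed)
    # 1. Sample a Gaussian neighbourhood around the queried point x (1-D array).
    X_local = x + rng.normal(scale=scale, size=(n_samples, x.size))
    # 2. One-versus-all labels: 1 where the black box predicts the foil class.
    y_local = (model.predict(X_local) == foil_class).astype(int)
    if y_local.min() == y_local.max():
        return []  # every local sample got the same label; no contrast to explain
    # 3. Fit a shallow surrogate decision tree on the local foil-vs-rest problem.
    surrogate = DecisionTreeClassifier(max_depth=4, random_state=seed)
    surrogate.fit(X_local, y_local)
    t = surrogate.tree_

    # Parent pointers so any leaf can be traced back to the root.
    parent, stack = {0: None}, [0]
    while stack:
        node = stack.pop()
        for child in (t.children_left[node], t.children_right[node]):
            if child != -1:
                parent[child] = node
                stack.append(child)

    def path_to_root(leaf):
        nodes = []
        while leaf is not None:
            nodes.append(leaf)
            leaf = parent[leaf]
        return nodes[::-1]  # ordered root -> leaf

    # The "fact" leaf is where x itself lands in the surrogate tree.
    fact_path = set(path_to_root(surrogate.apply(x.reshape(1, -1))[0]))

    # Leaves whose majority vote is the foil (label 1 in the one-vs-all task).
    foil_leaves = [i for i in range(t.node_count)
                   if t.children_left[i] == -1 and t.value[i][0][1] > t.value[i][0][0]]
    if not foil_leaves:
        return []

    # Foil leaf sharing the longest path with the fact leaf, as a simple proxy
    # for the "closest" foil leaf.
    foil_path = path_to_root(max(foil_leaves,
                                 key=lambda l: len(fact_path & set(path_to_root(l)))))

    # Collect the rules on the foil path after it diverges from the fact path:
    # the conditions that would make the local tree output the foil instead.
    rules = []
    for node, nxt in zip(foil_path[:-1], foil_path[1:]):
        if nxt in fact_path:
            continue  # still on the shared prefix of both paths
        op = "<=" if nxt == t.children_left[node] else ">"
        rules.append((int(t.feature[node]), op, float(t.threshold[node])))
    return rules
```

Each returned triple (feature index, operator, threshold) can then be verbalised as a contrastive statement, e.g. "the output would have been the foil if feature 3 had been <= 2.7".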
