Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives

In this paper we propose a novel method that provides contrastive explanations justifying the classification of an input by a black box classifier such as a deep neural network. Given an input, we find what should be minimally and sufficiently present (viz. important object pixels in an image) to justify its classification, and analogously what should be minimally and necessarily \emph{absent} (viz. certain background pixels). We argue that such explanations are natural for humans and are commonly used in domains such as health care and criminology. What is minimally but critically \emph{absent} is an important part of an explanation, which, to the best of our knowledge, has not been explicitly identified by current explanation methods that explain predictions of neural networks. We validate our approach on three real datasets from diverse domains: the handwritten digits dataset MNIST, a large procurement fraud dataset, and a brain activity strength dataset. In all three cases, we witness the power of our approach in generating precise explanations that are also easy for human experts to understand and evaluate.
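To make the idea of a "pertinent negative" concrete, the following is a minimal sketch, not the paper's exact algorithm: it searches for the smallest additive perturbation delta such that a classifier's prediction on x0 + delta differs from its prediction on x0. The toy linear-softmax model, the function name pertinent_negative, and the hyperparameters beta, lam, and kappa are illustrative assumptions; the paper's actual method additionally uses a FISTA-style solver and an autoencoder reconstruction term, which are omitted here for brevity.

```python
# Sketch only: find a sparse, nonnegative perturbation (features to *add*)
# that flips the predicted class, penalized by an elastic-net term.
import numpy as np

rng = np.random.default_rng(0)
W, b = rng.normal(size=(3, 5)), rng.normal(size=3)   # toy 5-feature, 3-class classifier

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def predict_proba(x):
    return softmax(W @ x + b)

def pertinent_negative(x0, beta=0.1, lam=0.1, kappa=0.1, lr=0.05, steps=500):
    t0 = int(np.argmax(predict_proba(x0)))            # original predicted class
    delta = np.zeros_like(x0)
    eps = 1e-4

    def loss(d):
        p = predict_proba(x0 + d)
        others = np.delete(p, t0)
        # hinge-style term: positive while the original class still wins by margin kappa
        hinge = max(p[t0] - others.max(), -kappa)
        return hinge + beta * np.abs(d).sum() + lam * (d ** 2).sum()

    for _ in range(steps):
        # central-difference gradient keeps the sketch free of autodiff dependencies
        grad = np.array([(loss(delta + eps * np.eye(len(x0))[i]) -
                          loss(delta - eps * np.eye(len(x0))[i])) / (2 * eps)
                         for i in range(len(x0))])
        delta -= lr * grad
        delta = np.maximum(delta, 0.0)                 # only add what was absent in x0
    return delta, int(np.argmax(predict_proba(x0 + delta)))

x0 = rng.normal(size=5)
delta, new_class = pertinent_negative(x0)
print("perturbation:", np.round(delta, 3), "new class:", new_class)
```

The pertinent-positive case is symmetric under the same assumptions: instead of adding features to x0, one restricts the search to features already present in x0 and asks for the minimal subset whose retention alone preserves the original prediction.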
