Towards Automatic Concept-based Explanations

Interpretability has become an important topic of research as more machine learning (ML) models are deployed and widely used to make important decisions. Most current explanation methods provide explanations through feature importance scores, which identify the features that are important for each individual input. However, systematically summarizing and interpreting such per-sample feature importance scores is itself challenging. In this work, we propose principles and desiderata for concept-based explanation, which goes beyond per-sample features to identify higher-level, human-understandable concepts that apply across the entire dataset. We develop a new algorithm, ACE, to automatically extract visual concepts. Our systematic experiments demonstrate that ACE discovers concepts that are human-meaningful, coherent, and important for the neural network's predictions.
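At a high level, ACE extracts multi-resolution superpixel segments from images of a class, embeds each segment with the trained network's intermediate activations, and clusters the embeddings so that each cluster forms a candidate concept whose importance can then be quantified with TCAV. The following is a minimal sketch of that pipeline, not the authors' implementation; it assumes scikit-image and scikit-learn are available, `get_activations` is a hypothetical stand-in for the classifier's bottleneck-layer forward pass, and the segment counts and cluster count are illustrative defaults.

```python
# Minimal ACE-style sketch: multi-resolution superpixels -> activation
# embeddings -> clusters as candidate concepts.
import numpy as np
from skimage.segmentation import slic
from skimage.transform import resize
from sklearn.cluster import KMeans

def extract_segments(images, n_segments_list=(15, 50, 80), out_size=(224, 224)):
    """Segment each image at several resolutions and resize every
    superpixel patch back to the model's input size."""
    patches = []
    for img in images:
        for n in n_segments_list:
            labels = slic(img, n_segments=n, compactness=20)
            for seg_id in np.unique(labels):
                mask = (labels == seg_id)[..., None]
                # Keep the segment, zero out the rest (the paper fills
                # the background with a neutral/mean value instead).
                patch = img * mask
                patches.append(resize(patch, (*out_size, img.shape[-1])))
    return np.stack(patches)

def discover_concepts(patches, get_activations, n_concepts=25):
    """Cluster segment embeddings; each cluster is a candidate concept."""
    acts = get_activations(patches)  # (num_patches, d) bottleneck features
    km = KMeans(n_clusters=n_concepts).fit(acts)
    return [patches[km.labels_ == c] for c in range(n_concepts)]
```

In the paper, each discovered cluster is additionally filtered for outliers and then scored with TCAV to rank how important the concept is for the network's prediction of the class.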
