Review Study of Interpretation Methods for Future Interpretable Machine Learning

In recent years, black-box models have advanced rapidly because of their high accuracy, making the balance between interpretability and accuracy increasingly important: a lack of interpretability severely limits a model's adoption in both academia and industry. Although many interpretable machine learning methods exist, they differ in the perspective they take and in what their interpretations mean. We review current interpretable methods and organize them according to the model being explained, dividing them into two categories: interpretable methods built on self-explanatory models and interpretable methods that rely on external co-explanation. The latter category is further divided into subbranches based on instances, SHAP, knowledge graphs, deep learning, and clustering models. This classification is intended to clarify the characteristics of the models used by each interpretable method, helping researchers find a suitable model for their interpretability problems, while our comparison experiments reveal complementary strengths across methods. Finally, we discuss future challenges and trends to promote the development of interpretable machine learning.
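
To make the SHAP subbranch of external co-explanation concrete, the sketch below applies post-hoc Shapley-value attribution to a black-box classifier. It assumes the open-source shap library and scikit-learn; the RandomForestClassifier and the breast-cancer dataset are illustrative choices, not the models compared in this survey.

```python
# A minimal sketch of post-hoc explanation with SHAP (external co-explanation),
# assuming the `shap` library and scikit-learn are installed; the dataset and
# model below are illustrative choices.
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# Train an opaque tree-ensemble "black box".
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer attributes each prediction to the input features
# via Shapley values computed exactly for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:100])

# Aggregate the local attributions into a global feature-importance summary.
shap.summary_plot(shap_values, X.iloc[:100])
```

Because the explanation is produced by an external explainer rather than by the model itself, the same procedure can be reused across different black-box models, which is the defining trait of the external co-explanation category.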
