Explainable AI: Interpreting, Explaining and Visualizing Deep Learning

In recent years, machine learning (ML) has become a key enabling technology for the sciences and industry. Thanks to improvements in methodology, the availability of large databases, and increased computational power, today’s ML algorithms achieve excellent performance (at times even exceeding the human level) on a growing number of complex tasks, with deep learning models at the forefront of this development. However, due to their nested nonlinear structure, these powerful models have generally been considered “black boxes” that provide no information about what exactly makes them arrive at their predictions. Since in many applications, e.g., in the medical domain, such a lack of transparency may not be acceptable, the development of methods for visualizing, explaining and interpreting deep learning models has recently attracted increasing attention. This introductory paper presents recent developments and applications in this field and makes a plea for a wider use of explainable learning algorithms in practice.
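To make one such explanation method concrete, the sketch below computes a plain gradient-based saliency map: the gradient of the predicted class score with respect to the input pixels shows which pixels the prediction is most sensitive to. This is a minimal sketch assuming PyTorch; the untrained toy CNN and the random input are hypothetical placeholders standing in for any differentiable classifier and any real image.

```python
# Minimal sketch of a gradient-based saliency map, assuming PyTorch.
# The toy CNN below is untrained and the input is random noise; both are
# placeholders for a real classifier and a real image.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)
model.eval()

x = torch.rand(1, 3, 32, 32, requires_grad=True)  # placeholder "image"

logits = model(x)
target = logits.argmax(dim=1).item()   # explain the class the model predicts
logits[0, target].backward()           # d(class score) / d(input pixels)

# Aggregate the absolute gradient over color channels: one value per pixel,
# large where small pixel changes would most affect the class score.
saliency = x.grad.detach().abs().max(dim=1).values.squeeze(0)
print(saliency.shape)  # torch.Size([32, 32])
```

Raw gradients are only the simplest instance of this idea; more refined attribution techniques such as SmoothGrad, integrated gradients, and layer-wise relevance propagation build on the same principle of propagating a prediction back to the input features.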
