A Survey on the Explainability of Supervised Machine Learning

Predictions obtained by, e.g., artificial neural networks are often highly accurate, yet humans frequently perceive the underlying models as black boxes: insight into their decision making remains largely opaque. Understanding this decision making is of paramount importance, particularly in highly sensitive areas such as healthcare or finance, where models must be transparent, accountable, and understandable for humans. This survey provides essential definitions and an overview of the principles and methodologies of explainable Supervised Machine Learning (SML). We conduct a state-of-the-art survey that reviews past and recent explainable SML approaches and classifies them according to the introduced definitions. Finally, we illustrate the principles by means of an explanatory case study and discuss important future directions.
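As a minimal, hypothetical illustration of the kind of post-hoc, model-agnostic explanation technique surveyed here (not the paper's own case study), the sketch below trains an accurate but hard-to-interpret classifier and then estimates global feature importance via permutation importance; the dataset, model, and parameter choices are assumptions made purely for illustration.

```python
# Hypothetical sketch: explaining a black-box classifier with a
# model-agnostic technique (permutation feature importance).
# Dataset and model are illustrative assumptions, not the survey's case study.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# The "black box": an accurate ensemble whose internals are opaque to humans.
model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Post-hoc explanation: shuffle each feature on held-out data and measure
# how much the model's score drops; larger drops indicate more important features.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0)

for name, mean_drop in sorted(
        zip(X.columns, result.importances_mean),
        key=lambda t: t[1], reverse=True)[:5]:
    print(f"{name}: {mean_drop:.3f}")
```

Such a global, model-agnostic summary is only one family of approaches; the survey also covers interpretable-by-design models, local (per-instance) explanations, and surrogate-model techniques.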
