Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI

In recent years, Artificial Intelligence (AI) has gained remarkable momentum that may deliver on the highest expectations across many application sectors. For this to occur, the entire community must overcome the barrier of explainability, a problem inherent to the latest sub-symbolic AI techniques (e.g., ensembles or Deep Neural Networks) that was absent in the previous wave of AI. Paradigms addressing this problem fall within the so-called eXplainable AI (XAI) field, which is widely acknowledged as a crucial feature for the practical deployment of AI models. This overview examines the existing literature in the field of XAI, together with a prospect of what is yet to be achieved. We summarize previous efforts to define explainability in Machine Learning, establishing a novel definition that covers prior conceptual propositions and places its major focus on the audience for which explainability is sought. We then propose and discuss a taxonomy of recent contributions related to the explainability of different Machine Learning models, including those aimed at Deep Learning methods, for which a second, dedicated taxonomy is built. This literature analysis serves as the background for a series of challenges faced by XAI, such as the crossroads between data fusion and explainability. Our prospects lead toward the concept of Responsible Artificial Intelligence, namely, a methodology for the large-scale implementation of AI methods in real organizations with fairness, model explainability, and accountability at its core. Our ultimate goal is to provide newcomers to XAI with reference material that stimulates future research advances, but also to encourage experts and professionals from other disciplines to embrace the benefits of AI in their activity sectors without any prior bias due to its lack of interpretability.
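To make the notion of post-hoc explainability concrete for newcomers, the sketch below illustrates one model-agnostic technique of the kind covered by the taxonomy discussed above: permutation feature importance. This is a minimal illustration, not a method proposed in this survey; the dataset, estimator, and scikit-learn helpers (load_breast_cancer, RandomForestClassifier, permutation_importance) are assumptions chosen only to keep the example self-contained and runnable.

```python
# Minimal sketch of a post-hoc, model-agnostic explanation technique
# (permutation feature importance). Illustrative only: the dataset and
# estimator are assumptions, not choices made by the survey itself.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An opaque ("black-box") ensemble: accurate, but not transparent by design.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Post-hoc explanation: shuffle each feature on held-out data and measure the
# drop in predictive performance; larger drops indicate heavier reliance.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)
ranked = sorted(zip(X.columns, result.importances_mean, result.importances_std),
                key=lambda t: -t[1])
for name, mean, std in ranked[:5]:
    print(f"{name}: {mean:.3f} +/- {std:.3f}")
```

Such feature-ranking explanations are only one family in the taxonomy; they trade fidelity for simplicity, which is precisely the kind of compromise the audience-centered definition of explainability is meant to surface.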
