Logic Explained Networks

The large and still growing popularity of deep learning clashes with a major limitation of neural network architectures: their inability to provide human-understandable motivations for their decisions. When a machine is expected to support the decisions of human experts, providing a comprehensible explanation is of crucial importance. The language used to communicate such explanations must be formal enough to be implementable in a machine and friendly enough to be understandable by a wide audience. In this paper, we propose a general approach to Explainable Artificial Intelligence for neural architectures, showing how a mindful design of the networks leads to a family of interpretable deep learning models called Logic Explained Networks (LENs). LENs only require their inputs to be human-understandable predicates, and they provide explanations in terms of simple First-Order Logic (FOL) formulas involving such predicates. LENs are general enough to cover a large number of scenarios. Among them, we consider the case in which LENs are directly used as interpretable classifiers, as well as the case in which they act as additional networks that create the conditions for making a black-box classifier explainable by FOL formulas. Although we mostly emphasize supervised learning problems, we also show that LENs can learn and provide explanations in unsupervised learning settings. Experimental results on several datasets and tasks show that LENs may yield better classifications than established white-box models, such as decision trees and Bayesian rule lists, while providing more compact and meaningful explanations.
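
To make the general idea concrete, below is a minimal, hypothetical sketch in plain PyTorch, not the authors' implementation or their companion library, of a classifier that takes human-understandable Boolean predicates as input and whose learned behaviour is summarised by a propositional formula over those predicates. The toy task (y = c0 AND NOT c1), the network architecture, and the brute-force truth-table extraction are all illustrative assumptions rather than the LEN method described in the paper, which produces FOL-level explanations by more scalable means.

import itertools
import torch

torch.manual_seed(0)

# Toy dataset of human-understandable predicates: two Boolean concepts c0, c1,
# with target y = c0 AND (NOT c1). The full truth table serves as training data.
X = torch.tensor(list(itertools.product([0.0, 1.0], repeat=2)))
y = (X[:, 0] * (1.0 - X[:, 1])).unsqueeze(1)

# A small feed-forward classifier operating directly on the concept inputs.
model = torch.nn.Sequential(
    torch.nn.Linear(2, 8),
    torch.nn.ReLU(),
    torch.nn.Linear(8, 1),
    torch.nn.Sigmoid(),
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
for _ in range(500):
    optimizer.zero_grad()
    loss = torch.nn.functional.binary_cross_entropy(model(X), y)
    loss.backward()
    optimizer.step()

# "Explanation" extraction by enumerating the truth table of the learned function:
# every concept combination predicted positive contributes one conjunction (minterm),
# and the disjunction of those minterms is a logic-level summary of the classifier.
names = ["c0", "c1"]
minterms = []
for row in X:
    if model(row.unsqueeze(0)).item() > 0.5:
        literals = [n if v > 0.5 else "~" + n for n, v in zip(names, row)]
        minterms.append(" & ".join(literals))
print(" | ".join(minterms) if minterms else "False")  # expected: c0 & ~c1

Exhaustive truth-table enumeration is only tractable for a handful of predicates; it is used here purely to illustrate what "an explanation as a logic formula over human-understandable predicates" means.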
