Human-Driven FOL Explanations of Deep Learning

Deep neural networks are usually regarded as black boxes: their complex internal architecture cannot straightforwardly provide human-understandable explanations of how they behave. Indeed, Deep Learning is still viewed with skepticism in real-world domains where incorrect predictions may have critical consequences. This is one of the reasons why, in recent years, Explainable Artificial Intelligence (XAI) techniques have gained considerable attention in the scientific community. In this paper, we focus on the case of multi-label classification, proposing a neural network that learns the relationships among the predictors associated with each class and yields First-Order Logic (FOL)-based descriptions. The explanation-related network and the classification-related network are jointly learned, implicitly introducing a latent dependency between the development of the explanation mechanism and the development of the classifiers. Our model can integrate human-driven preferences that guide the learning-to-explain process, and it is presented in a unified framework. Different typologies of explanations are evaluated in distinct experiments, showing that the proposed approach discovers new knowledge and can improve classifier performance.
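To make the joint learning scheme concrete, below is a minimal PyTorch sketch of one plausible instantiation: a multi-label classifier and an explanation network trained together, where the explainer reconstructs one class predictor from the others, so that its weights capture the inter-class relationships from which FOL-style rules could be extracted. All names (classifier, explainer, lambda_expl), the architecture sizes, and the synthetic data are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the joint classifier/explainer training loop described
# in the abstract (PyTorch). All names, layer sizes, the trade-off weight
# `lambda_expl`, and the synthetic data are illustrative assumptions,
# not the authors' implementation.
import torch
import torch.nn as nn

n_features, n_classes = 16, 4

# Classification-related network: multi-label predictor in [0, 1]^n_classes.
classifier = nn.Sequential(
    nn.Linear(n_features, 32), nn.ReLU(),
    nn.Linear(32, n_classes), nn.Sigmoid(),
)

# Explanation-related network: reconstructs the first class predictor from
# the remaining ones, so its weights encode inter-class relationships that
# could later be read off as FOL-style rules over the class predicates.
explainer = nn.Sequential(nn.Linear(n_classes - 1, 1), nn.Sigmoid())

opt = torch.optim.Adam(
    list(classifier.parameters()) + list(explainer.parameters()), lr=1e-3
)
bce = nn.BCELoss()
lambda_expl = 0.5  # classification vs. explanation trade-off (assumed)

x = torch.rand(128, n_features)                  # toy inputs
y = (torch.rand(128, n_classes) > 0.5).float()   # toy multi-label targets

for step in range(200):
    opt.zero_grad()
    p = classifier(x)  # per-class predictions
    # Joint objective: the explanation term also back-propagates through
    # the classifier, creating the latent dependency between the
    # explanation mechanism and the classifiers mentioned in the abstract.
    expl_loss = bce(explainer(p[:, 1:]), p[:, :1].detach())
    loss = bce(p, y) + lambda_expl * expl_loss
    loss.backward()
    opt.step()
```

In the paper's setting, human-driven preferences would presumably enter as additional terms or constraints on this objective (for instance, regularizers guiding the explainer's weights); the sketch above only shows the basic joint loss.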
