Uncovering Semantic Bias in Neural Network Models Using a Knowledge Graph

While neural network models have shown impressive performance on many NLP tasks, their lack of interpretability is often seen as a disadvantage. The individual relevance scores assigned by post-hoc explanation methods are not sufficient to reveal deeper systematic preferences and potential biases of a model that apply consistently across examples. In this paper we combine rule mining over knowledge graphs with neural network explanation methods to uncover such systematic preferences of trained neural models and to capture them in the form of conjunctive rules. We test our approach on text classification tasks and show that such rules can explain a substantial part of the model behaviour, as well as indicate potential causes of misclassification when the model is applied outside of its initial training context.
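
To make the idea concrete, below is a minimal, self-contained Python sketch of the kind of pipeline the abstract describes: each document is reduced to the knowledge-graph classes of its most relevant tokens, and conjunctive rules over those classes are mined against the model's predictions. Everything here is illustrative rather than the paper's actual implementation: the toy documents, the dbo:-style class names, and the brute-force Apriori-style enumeration stand in for a real explanation method, an entity linker over an actual knowledge graph, and a scalable rule miner.

```python
from itertools import combinations

# Hypothetical toy input: each document is reduced to the set of KG
# classes of its most relevant tokens (in a real pipeline these would
# come from a post-hoc explanation method plus entity linking), paired
# with the label the trained model predicts for that document.
documents = [
    ({"dbo:Politician", "dbo:Country"},    "politics"),
    ({"dbo:Politician", "dbo:Election"},   "politics"),
    ({"dbo:Country",    "dbo:Election"},   "politics"),
    ({"dbo:Athlete",    "dbo:SportsTeam"}, "sports"),
    ({"dbo:Athlete",    "dbo:Country"},    "sports"),
]

def mine_conjunctive_rules(documents, target, min_support=2,
                           min_confidence=0.8, max_body_size=2):
    """Mine rules of the form {C1, ..., Ck} -> target over KG classes.

    support    = number of documents whose relevant-token classes
                 contain all classes in the rule body
    confidence = fraction of those documents that the model actually
                 labels with `target`
    """
    all_classes = sorted(set().union(*(classes for classes, _ in documents)))
    rules = []
    for k in range(1, max_body_size + 1):
        for body in combinations(all_classes, k):
            body = frozenset(body)
            covered = [label for classes, label in documents
                       if body <= classes]
            if len(covered) < min_support:
                continue
            confidence = covered.count(target) / len(covered)
            if confidence >= min_confidence:
                rules.append((body, len(covered), confidence))
    return rules

for body, support, confidence in mine_conjunctive_rules(documents, "politics"):
    print(f"{sorted(body)} -> politics "
          f"(support={support}, confidence={confidence:.2f})")
```

A high-confidence rule such as {dbo:Politician} -> politics makes a systematic preference of the model explicit; conversely, a high-confidence rule built on a semantically irrelevant class would be a candidate explanation for misclassifications when the model is applied outside its training context.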
