Explainable zero-shot learning via attentive graph convolutional network and knowledge graphs

Zero-shot learning (ZSL) which aims to deal with new classes that have never appeared in the training data (i.e., unseen classes) has attracted massive research interests recently. Transferring of deep features learned from training classes (i.e., seen classes) are often used, but most current methods are black-box models without any explanations, especially textual explanations that are more acceptable to not only machine learning specialists but also common people without artificial intelligence expertise. In this paper, we focus on explainable ZSL, and present a knowledge graph (KG) based framework that can explain the transferability of features in ZSL in a human understandable manner. The framework has two modules: an attentive ZSL learner and an explanation generator. The former utilizes an Attentive Graph Convolutional Network (AGCN) to match class knowledge from WordNet with deep features learned from CNNs (i.e., encode inter-class relationship to predict classifiers), in which the features of unseen classes are transferred from seen classes to predict the samples of unseen classes, with impressive (important) seen classes detected, while the latter generates human understandable explanations for the transferability of features with class knowledge that are enriched by external KGs, including a domain-specific Attribute Graph and DBpedia. We evaluate our method on two benchmarks of animal recognition. Augmented by class knowledge from KGs, our framework generates promising explanations for the transferability of features, and at the same time improves the recognition accuracy.

[1]  Satoshi Sekine,et al.  A survey of named entity recognition and classification , 2007 .

[2]  Franco Turini,et al.  A Survey of Methods for Explaining Black Box Models , 2018, ACM Comput. Surv..

[3]  Bernt Schiele,et al.  Latent Embeddings for Zero-Shot Classification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Geoffrey E. Hinton,et al.  Zero-shot Learning with Semantic Output Codes , 2009, NIPS.

[5]  Wei-Lun Chao,et al.  Synthesized Classifiers for Zero-Shot Learning , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Dongyan Zhao,et al.  Jointly Learning Entity and Relation Representations for Entity Alignment , 2019, EMNLP.

[7]  Yike Guo,et al.  Integrating Semantic Knowledge to Tackle Zero-shot Text Classification , 2019, NAACL.

[8]  David Ratcliffe,et al.  Closed-World Concept Induction for Learning in OWL Knowledge Bases , 2014, EKAW.

[9]  Tao Mei,et al.  Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions , 2018, EMNLP.

[10]  R. Agarwal Fast Algorithms for Mining Association Rules , 1994, VLDB 1994.

[11]  Tao Xiang,et al.  Learning a Deep Embedding Model for Zero-Shot Learning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Nuno Vasconcelos,et al.  Semantically Consistent Regularization for Zero-Shot Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Yuzhong Qu,et al.  Multi-view Knowledge Graph Embedding for Entity Alignment , 2019, IJCAI.

[14]  Pietro Perona,et al.  Caltech-UCSD Birds 200 , 2010 .

[15]  Kathleen McKeown,et al.  Human-Centric Justification of Machine Learning Predictions , 2017, IJCAI.

[16]  Diyi Yang,et al.  Hierarchical Attention Networks for Document Classification , 2016, NAACL.

[17]  Ramprasaath R. Selvaraju,et al.  Choose Your Neuron: Incorporating Domain Knowledge through Neuron-Importance , 2018, ECCV.

[18]  Hao Wang,et al.  Rethinking Knowledge Graph Propagation for Zero-Shot Learning , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Samy Bengio,et al.  Zero-Shot Learning by Convex Combination of Semantic Embeddings , 2013, ICLR.

[20]  Huajun Chen,et al.  Generative Adversarial Zero-shot Learning via Knowledge Graphs , 2020, ArXiv.

[21]  Percy Liang,et al.  Zero-shot Entity Extraction from Web Pages , 2014, ACL.

[22]  Huajun Chen,et al.  Ontology-guided Semantic Composition for Zero-Shot Learning , 2020, KR.

[23]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[24]  Huajun Chen,et al.  Zero-shot Text Classification via Reinforced Self-training , 2020, ACL.

[25]  Qiang Yang,et al.  Understanding How Feature Structure Transfers in Transfer Learning , 2017, IJCAI.

[26]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[27]  Marc'Aurelio Ranzato,et al.  DeViSE: A Deep Visual-Semantic Embedding Model , 2013, NIPS.

[28]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[29]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[30]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  John G. Breslin,et al.  Transfer Learning for Item Recommendations and Knowledge Graph Completion in Item Related Domains via a Co-Factorization Model , 2018, ESWC.

[32]  Christoph H. Lampert,et al.  Attribute-Based Classification for Zero-Shot Visual Object Categorization , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[33]  Ming-Wei Chang,et al.  Zero-Shot Entity Linking by Reading Entity Descriptions , 2019, ACL.

[34]  Gerard Salton,et al.  A vector space model for automatic indexing , 1975, CACM.

[35]  William Yang Wang,et al.  Towards Explainable NLP: A Generative Explanation Framework for Text Classification , 2018, ACL.

[36]  Ali Farhadi,et al.  Describing objects by their attributes , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[37]  Maria Pershina,et al.  Holistic entity matching across knowledge graphs , 2015, 2015 IEEE International Conference on Big Data (Big Data).

[38]  Cynthia Rudin,et al.  Learning theory analysis for association rules and sequential event prediction , 2013, J. Mach. Learn. Res..

[39]  Cordelia Schmid,et al.  Label-Embedding for Attribute-Based Classification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[40]  Danna Zhou,et al.  d. , 1840, Microbial pathogenesis.

[41]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[42]  이현주 Q. , 2005 .

[43]  Zachary Chase Lipton The mythos of model interpretability , 2016, ACM Queue.

[44]  Chunyun Zhang,et al.  Generative Adversarial Zero-Shot Relational Learning for Knowledge Graphs , 2020, AAAI.

[45]  Martin Wattenberg,et al.  Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation , 2016, TACL.

[46]  Christoph H. Lampert,et al.  Learning to detect unseen object classes by between-class attribute transfer , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[47]  Muktabh Mayank Srivastava,et al.  Train Once, Test Anywhere: Zero-Shot Learning for Text Classification , 2017, ArXiv.

[48]  Xi Chen,et al.  Long-tail Relation Extraction via Knowledge Graph Embeddings and Graph Convolution Networks , 2019, NAACL.

[49]  Enrico Motta,et al.  Dedalo: Looking for Clusters Explanations in a Labyrinth of Linked Data , 2014, ESWC.

[50]  Ramakrishnan Srikant,et al.  Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.

[51]  Huajun Chen,et al.  Human-centric Transfer Learning Explanation via Knowledge Graph [Extended Abstract] , 2019, ArXiv.

[52]  Garrison W. Cottrell,et al.  A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction , 2017, IJCAI.

[53]  Abhinav Gupta,et al.  Zero-Shot Recognition via Semantic Embeddings and Knowledge Graphs , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[54]  Weng-Keen Wong,et al.  Principles of Explanatory Debugging to Personalize Interactive Machine Learning , 2015, IUI.

[55]  Tsuyoshi Murata,et al.  {m , 1934, ACML.

[56]  Heiko Paulheim,et al.  Make Embeddings Semantic Again! , 2018, SEMWEB.

[57]  George A. Miller,et al.  Introduction to WordNet: An On-line Lexical Database , 1990 .

[58]  Or Biran,et al.  Explanation and Justification in Machine Learning : A Survey Or , 2017 .

[59]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[60]  Johannes Gehrke,et al.  Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission , 2015, KDD.

[61]  Yu-Chiang Frank Wang,et al.  Multi-label Zero-Shot Learning with Structured Knowledge Graphs , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[62]  Daniel S. Weld,et al.  The challenge of crafting intelligible intelligence , 2018, Commun. ACM.

[63]  Tomasz Imielinski,et al.  Mining association rules between sets of items in large databases , 1993, SIGMOD Conference.

[64]  Taghi M. Khoshgoftaar,et al.  A survey of transfer learning , 2016, Journal of Big Data.

[65]  James Hays,et al.  SUN attribute database: Discovering, annotating, and recognizing scene attributes , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[66]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[67]  Huajun Chen,et al.  Augmenting Transfer Learning with Semantic Reasoning , 2019, IJCAI.

[68]  Jens Lehmann,et al.  DBpedia - A large-scale, multilingual knowledge base extracted from Wikipedia , 2015, Semantic Web.

[69]  Rada Mihalcea,et al.  TextRank: Bringing Order into Text , 2004, EMNLP.

[70]  Carlos Guestrin,et al.  "Why Should I Trust You?": Explaining the Predictions of Any Classifier , 2016, ArXiv.

[71]  Christoph H. Lampert,et al.  Zero-Shot Learning—A Comprehensive Evaluation of the Good, the Bad and the Ugly , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[72]  Shaogang Gong,et al.  Semantic Autoencoder for Zero-Shot Learning , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[73]  Jaime G. Carbonell,et al.  Zero-shot Neural Transfer for Cross-lingual Entity Linking , 2018, AAAI.

[74]  Yoshua Bengio,et al.  Zero-data Learning of New Tasks , 2008, AAAI.

[75]  Sören Auer,et al.  LIMES - A Time-Efficient Approach for Large-Scale Link Discovery on the Web of Data , 2011, IJCAI.

[76]  Omer Levy,et al.  Zero-Shot Relation Extraction via Reading Comprehension , 2017, CoNLL.

[77]  Melanie Mitchell,et al.  Interpreting individual classifications of hierarchical networks , 2013, 2013 IEEE Symposium on Computational Intelligence and Data Mining (CIDM).

[78]  Huajun Chen,et al.  Knowledge-based Transfer Learning Explanation , 2018, KR.

[79]  Yoshua Bengio,et al.  How transferable are features in deep neural networks? , 2014, NIPS.

[80]  Pietro Liò,et al.  Graph Attention Networks , 2017, ICLR.

[81]  Christian Bizer,et al.  DBpedia spotlight: shedding light on the web of documents , 2011, I-Semantics '11.