Generating Explainable Abstractions for Wikidata Entities

The broad coverage and high quality of the Wikidata knowledge graph make it well suited to downstream applications such as entity summarization, entity linking, and question answering. Yet most retrieval and similarity-based methods for Wikidata make limited use of its semantics and lose the link between Wikidata's rich structure and the decision-making algorithm. In this paper, we investigate how to define abstractive representations (profiles) of Wikidata entities. We propose a scalable method that produces profiles for Wikidata entities based on salient labels associated with their types. We represent the resulting profiles as a graph and compute profile embeddings. Our empirical analysis shows that the profiles capture similarity competitively with the baselines and excel in terms of explainability. On the task of neural entity linking in tables, the profiles outperform all baselines in terms of accuracy, while their human-readable representation clearly explains the source of the improvement. We make our code and data available to facilitate novel use cases based on the Wikidata profiles.
