Text-Based Joint Prediction of Numeric and Categorical Attributes of Entities in Knowledge Bases

Collaboratively constructed knowledge bases play an important role in information systems, but they are essentially always incomplete. Many models have therefore been developed for Knowledge Base Completion, the task of predicting new attributes of entities given partial descriptions of these entities. Virtually all of these models concentrate either on numeric attributes or on categorical attributes. In this paper, we propose a simple feed-forward neural architecture that jointly predicts numeric and categorical attributes from embeddings learned from textual occurrences of the entities in question. Following insights from multi-task learning, our hypothesis is that, because attributes of different kinds are correlated, joint prediction improves over separate prediction. Our experiments on seven FreeBase domains show that this hypothesis holds for only one of the two attribute types: we find substantial improvements for numeric attributes in the joint model, while performance remains largely unchanged for categorical attributes. Our analysis indicates that this is because categorical attributes, many of which describe membership in various classes, provide useful ‘background knowledge’ for numeric prediction, while the reverse is true only to a lesser degree.
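
As an illustration of what joint prediction from entity embeddings could look like, the following is a minimal sketch and not the architecture used in the paper. It assumes a single shared tanh hidden layer, a linear output head for numeric attributes, a sigmoid output head for categorical attributes, and an unweighted sum of mean squared error and binary cross-entropy as the joint loss. The class name `JointAttributeModel`, the function `joint_loss`, and all layer sizes are hypothetical choices made for this example; only the forward pass and the loss are shown, not training.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, x):
    """Apply the affine map W x + b to the input vector x."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class JointAttributeModel:
    """Hypothetical joint model: one shared hidden layer, two output heads."""

    def __init__(self, emb_dim, hidden_dim, n_numeric, n_categorical, seed=0):
        rng = random.Random(seed)
        self.hidden = make_layer(emb_dim, hidden_dim, rng)
        self.numeric_head = make_layer(hidden_dim, n_numeric, rng)
        self.categorical_head = make_layer(hidden_dim, n_categorical, rng)

    def forward(self, embedding):
        # Shared representation used by both attribute types (assumed tanh).
        h = [math.tanh(z) for z in dense(self.hidden, embedding)]
        # Numeric attributes: unbounded real-valued outputs.
        numeric = dense(self.numeric_head, h)
        # Categorical attributes: independent probabilities in (0, 1).
        categorical = [sigmoid(z) for z in dense(self.categorical_head, h)]
        return numeric, categorical


def joint_loss(numeric_pred, numeric_gold, cat_pred, cat_gold, eps=1e-9):
    """Unweighted sum of MSE (numeric) and binary cross-entropy (categorical)."""
    mse = sum((p - g) ** 2 for p, g in zip(numeric_pred, numeric_gold)) / len(numeric_gold)
    bce = -sum(
        g * math.log(p + eps) + (1 - g) * math.log(1 - p + eps)
        for p, g in zip(cat_pred, cat_gold)
    ) / len(cat_gold)
    return mse + bce


# Toy usage: a 4-dimensional entity embedding, 2 numeric and 5 categorical attributes.
model = JointAttributeModel(emb_dim=4, hidden_dim=3, n_numeric=2, n_categorical=5)
numeric, categorical = model.forward([0.2, -0.1, 0.4, 0.0])
loss = joint_loss(numeric, [0.5, 0.1], categorical, [1, 0, 0, 1, 0])
```

Because both heads read the same hidden layer, gradients from the categorical loss would also shape the representation used for numeric prediction, which is the kind of cross-attribute sharing that the multi-task hypothesis above relies on. A separate-prediction baseline could be sketched by instantiating two such models, each with only one head.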
