Modeling Fine-Grained Entity Types with Box Embeddings

Neural entity typing models typically represent entity types as vectors in a high-dimensional space, but such spaces are not well-suited to modeling these types’ complex interdependencies. We study the ability of box embeddings, which represent entity types as d-dimensional hyperrectangles, to represent hierarchies of fine-grained entity type labels even when these relationships are not defined explicitly in the ontology. Our model represents both types and entity mentions as boxes. Each mention and its context are fed into a BERT-based model to embed that mention in our box space; essentially, this model leverages typological clues present in the surface text to hypothesize a type representation for the mention. Soft box containment can then be used to derive probabilities, both the posterior probability of a mention exhibiting a given type and the conditional probability relations between the types themselves. We compare our approach with a strong vector-based typing model and observe state-of-the-art performance on several entity typing benchmarks. In addition to competitive typing performance, our box-based model shows better prediction consistency (predicting a supertype and a subtype together) and better confidence (i.e., calibration), implying that the box-based model captures the latent type hierarchies better than the vector-based model does.
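To make the soft-containment idea concrete, the following is a minimal PyTorch sketch rather than the paper's implementation: it assumes boxes parameterized by min/max corner tensors and uses softplus-smoothed volumes as a simpler stand-in for the Gumbel boxes (Boratko et al., 2020) the model actually builds on; the function names are hypothetical.

```python
# Minimal sketch: probabilities from soft box containment, assuming boxes
# given by min/max corners. Softplus smoothing is a simplified stand-in
# for the Gumbel boxes used in the paper.
import torch
import torch.nn.functional as F


def soft_volume(lo: torch.Tensor, hi: torch.Tensor, beta: float = 1.0) -> torch.Tensor:
    """Smoothed box volume: product of softplus side lengths.

    Softplus keeps every side length positive, so gradients still flow
    when two boxes barely overlap or are disjoint.
    """
    side = F.softplus(hi - lo, beta=1.0 / beta)
    return side.clamp(min=1e-38).prod(dim=-1)


def conditional_prob(lo_a: torch.Tensor, hi_a: torch.Tensor,
                     lo_b: torch.Tensor, hi_b: torch.Tensor,
                     beta: float = 1.0) -> torch.Tensor:
    """P(A | B) ≈ Vol(A ∩ B) / Vol(B), the soft-containment score.

    If box B (e.g., a mention box) lies entirely inside box A (a type
    box), the ratio approaches 1: the mention exhibits the type.
    """
    inter_lo = torch.maximum(lo_a, lo_b)  # intersection min corner
    inter_hi = torch.minimum(hi_a, hi_b)  # intersection max corner
    return soft_volume(inter_lo, inter_hi, beta) / soft_volume(lo_b, hi_b, beta)


# Toy usage: a broad "person" type box containing a smaller "artist" box.
person_lo, person_hi = torch.zeros(4), torch.ones(4)
artist_lo, artist_hi = torch.full((4,), 0.2), torch.full((4,), 0.6)
print(conditional_prob(person_lo, person_hi, artist_lo, artist_hi))  # ≈ 1: artist inside person
print(conditional_prob(artist_lo, artist_hi, person_lo, person_hi))  # < 1: person not inside artist
```

Because the score is a volume ratio, the same machinery yields both the probability of a mention exhibiting a type and the conditional probabilities between types themselves, which is how latent subtype relations can emerge without being stated explicitly in the ontology.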
