Zero-Shot Open Entity Typing as Type-Compatible Grounding

The problem of entity-typing has been studied predominantly in supervised learning fashion, mostly with task-specific annotations (for coarse types) and sometimes with distant supervision (for fine types). While such approaches have strong performance within datasets, they often lack the flexibility to transfer across text genres and to generalize to new type taxonomies. In this work we propose a zero-shot entity typing approach that requires no annotated data and can flexibly identify newly defined types. Given a type taxonomy defined as Boolean functions of FREEBASE "types", we ground a given mention to a set of type-compatible Wikipedia entries and then infer the target mention's types using an inference algorithm that makes use of the types of these entries. We evaluate our system on a broad range of datasets, including standard fine-grained and coarse-grained entity typing datasets, and also a dataset in the biological domain. Our system is shown to be competitive with state-of-the-art supervised NER systems and outperforms them on out-of-domain datasets. We also show that our system significantly outperforms other zero-shot fine typing systems.

[1]  Marc Moens,et al.  Named Entity Recognition without Gazetteers , 1999, EACL.

[2]  Oren Etzioni,et al.  No Noun Phrase Left Behind: Detecting and Typing Unlinkable Entities , 2012, EMNLP.

[3]  Erik F. Tjong Kim Sang,et al.  Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition , 2003, CoNLL.

[4]  Mihai Surdeanu,et al.  The Stanford CoreNLP Natural Language Processing Toolkit , 2014, ACL.

[5]  Dan Roth,et al.  On Dataless Hierarchical Text Classification , 2014, AAAI.

[6]  Steven Bird,et al.  NLTK: The Natural Language Toolkit , 2002, ACL.

[7]  Steven Bird,et al.  NLTK: The Natural Language Toolkit , 2002, ACL 2006.

[8]  Stan Matwin,et al.  Unsupervised Named-Entity Recognition: Generating Gazetteers and Resolving Ambiguity , 2006, Canadian AI.

[9]  Heng Ji,et al.  Building a Fine-Grained Entity Typing System Overnight for a New X (X = Language, Domain, Genre) , 2016, ArXiv.

[10]  Fernando Llopis,et al.  Improving Question Answering Using Named Entity Recognition , 2005, NLDB.

[11]  Mitchell P. Marcus,et al.  OntoNotes: The 90% Solution , 2006, NAACL.

[12]  Tong Zhang,et al.  Named Entity Recognition through Classifier Combination , 2003, CoNLL.

[13]  Kentaro Inui,et al.  Neural Architectures for Fine-grained Entity Type Classification , 2016, EACL.

[14]  Ming-Wei Chang,et al.  Importance of Semantic Representation: Dataless Classification , 2008, AAAI.

[15]  Joel Nothman,et al.  Analysing Wikipedia and Gold-Standard Corpora for NER Training , 2009, EACL.

[16]  Dan Roth,et al.  Incidental Supervision: Moving beyond Supervised Learning , 2017, AAAI.

[17]  Fernando Pereira,et al.  Wikilinks: A Large-scale Cross-Document Coreference Corpus Labeled via Links to Wikipedia , 2012 .

[18]  Dan Roth,et al.  Design Challenges and Misconceptions in Named Entity Recognition , 2009, CoNLL.

[19]  Samy Bengio,et al.  Zero-Shot Learning by Convex Combination of Semantic Embeddings , 2013, ICLR.

[20]  Geoffrey E. Hinton,et al.  Zero-shot Learning with Semantic Output Codes , 2009, NIPS.

[21]  Stephen D. Mayhew,et al.  CogCompNLP: Your Swiss Army Knife for NLP , 2018, LREC.

[22]  Joel Nothman,et al.  Transforming Wikipedia into Named Entity Training Data , 2008, ALTA.

[23]  Denilson Barbosa,et al.  Neural Fine-Grained Entity Type Classification with Hierarchy-Aware Loss , 2018, NAACL.

[24]  Gerhard Weikum,et al.  HYENA: Hierarchical Type Classification for Entity Names , 2012, COLING.

[25]  Satoshi Sekine,et al.  Extended Named Entity Hierarchy , 2002, LREC.

[26]  Doug Downey,et al.  OTyper: A Neural Architecture for Open Named Entity Typing , 2018, AAAI.

[27]  Ido Dagan,et al.  Recognizing textual entailment: Rational, evaluation and approaches – Erratum , 2010, Natural Language Engineering.

[28]  Luke S. Zettlemoyer,et al.  Deep Contextualized Word Representations , 2018, NAACL.

[29]  Doug Downey,et al.  Local and Global Algorithms for Disambiguation to Wikipedia , 2011, ACL.

[30]  Nevena Lazic,et al.  Embedding Methods for Fine Grained Entity Type Classification , 2015, ACL.

[31]  Dan Roth,et al.  Learning question classifiers: the role of semantic information , 2005, Natural Language Engineering.

[32]  Rada Mihalcea,et al.  Wikify!: linking documents to encyclopedic knowledge , 2007, CIKM '07.

[33]  Louise Deléger,et al.  Overview of the Bacteria Biotope Task at BioNLP Shared Task 2016 , 2016, BioNLP.

[34]  Ralph Grishman,et al.  Message Understanding Conference- 6: A Brief History , 1996, COLING.

[35]  Evgeniy Gabrilovich,et al.  Computing Semantic Relatedness Using Wikipedia-based Explicit Semantic Analysis , 2007, IJCAI.

[36]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[37]  Daniel S. Weld,et al.  Fine-Grained Entity Recognition , 2012, AAAI.

[38]  Heng Ji,et al.  AFET: Automatic Fine-Grained Entity Typing by Hierarchical Partial-Label Embedding , 2016, EMNLP.

[39]  Ashish Anand,et al.  Fine-Grained Entity Type Classification by Jointly Learning Representations and Label Embeddings , 2017, EACL.

[40]  Erik Cambria,et al.  Label Embedding for Zero-shot Fine-grained Named Entity Typing , 2016, COLING.

[41]  Omer Levy,et al.  Ultra-Fine Entity Typing , 2018, ACL.

[42]  Nevena Lazic,et al.  Context-Dependent Fine-Grained Entity Type Tagging , 2014, ArXiv.

[43]  Stephen D. Mayhew,et al.  Cross-Lingual Named Entity Recognition via Wikification , 2016, CoNLL.

[44]  Praveen Paritosh,et al.  Freebase: a collaboratively created graph database for structuring human knowledge , 2008, SIGMOD Conference.

[45]  Eduard H. Hovy,et al.  Fine Grained Classification of Named Entities , 2002, COLING.

[46]  Philip H. S. Torr,et al.  An embarrassingly simple approach to zero-shot learning , 2015, ICML.

[47]  Ido Dagan,et al.  Recognizing Textual Entailment: Models and Applications , 2013, Recognizing Textual Entailment: Models and Applications.