Ensemble-based Semantic Lexicon Induction for Semantic Tagging

We present an ensemble-based framework for semantic lexicon induction that incorporates three diverse approaches for semantic class identification. Our architecture brings together previous bootstrapping methods for pattern-based semantic lexicon induction and contextual semantic tagging, and incorporates a novel approach for inducing semantic classes from coreference chains. The three methods are embedded in a bootstrapping architecture where they produce independent hypotheses, consensus words are added to the lexicon, and the process repeats. Our results show that the ensemble outperforms individual methods in terms of both lexicon quality and instance-based semantic tagging.
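The bootstrapping loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the `Learner` interface, the function name `consensus_bootstrap`, the strict-majority voting rule, and the stopping condition are all assumptions made for illustration. The abstract only states that the three methods produce independent hypotheses and that consensus words are added to the lexicon before the process repeats.

```python
from collections import Counter
from typing import Callable, Dict, Optional, Sequence, Set

# Hypothetical interface: a learner receives the current lexicon and the pool of
# unlabeled candidate words, and returns {word: hypothesized semantic class}.
Learner = Callable[[Dict[str, Set[str]], Set[str]], Dict[str, str]]


def consensus_bootstrap(
    seeds: Dict[str, Set[str]],
    candidates: Set[str],
    learners: Sequence[Learner],
    min_votes: Optional[int] = None,
    max_iters: int = 10,
) -> Dict[str, Set[str]]:
    """Grow a semantic lexicon by adding only words the learners agree on.

    Assumption: "consensus" is modeled as a strict majority of learners
    voting for the same (word, class) pair unless min_votes is given.
    """
    if min_votes is None:
        min_votes = len(learners) // 2 + 1  # strict majority

    # Copy the seed sets so the caller's data is not mutated.
    lexicon = {cls: set(words) for cls, words in seeds.items()}

    for _ in range(max_iters):
        known = set().union(*lexicon.values())
        pool = candidates - known

        # Each learner hypothesizes independently from the same lexicon.
        votes: Counter = Counter()
        for learner in learners:
            for word, cls in learner(lexicon, pool).items():
                if word in pool and cls in lexicon:
                    votes[(word, cls)] += 1

        accepted = [(w, c) for (w, c), n in votes.items() if n >= min_votes]
        if not accepted:
            break  # no consensus words this round, so stop early

        for word, cls in accepted:
            lexicon[cls].add(word)

    return lexicon
```

In this sketch, each learner would wrap one of the three methods (pattern-based induction, contextual tagging, coreference-based induction). Requiring a strict majority means a word cannot be accepted into two classes in the same round, which is one simple way to limit semantic drift; the paper may use a different agreement criterion.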
