Relation Guided Bootstrapping of Semantic Lexicons

State-of-the-art bootstrapping systems rely on expert-crafted semantic constraints, such as negative categories, to reduce semantic drift. Unfortunately, these constraints introduce a substantial amount of supervised knowledge. We present the Relation Guided Bootstrapping (RGB) algorithm, which simultaneously extracts lexicons and open relationships to guide lexicon growth and reduce semantic drift. This removes the need to manually craft category and relationship constraints and to manually generate negative categories.
