Semi-Supervised Bootstrapping of Relationship Extractors with Distributional Semantics

Semi-supervised bootstrapping techniques for relationship extraction from text iteratively expand a set of initial seed relationships while limiting semantic drift. We investigate bootstrapping for relationship extraction that uses word embeddings to find similar relationships. Experimental results on extracting four types of relationships from a collection of newswire documents show that relying on word embeddings achieves better performance than a baseline that uses TF-IDF to find similar relationships.
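To make the idea of finding similar relationships with word embeddings concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a relationship context (the words between two entities) is represented as the sum of its word vectors and compared with a seed context by cosine similarity. The toy embeddings, the function names, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
import math

# Toy 3-dimensional embeddings, invented for illustration only.
# A real system would load vectors trained on a large corpus.
TOY_EMBEDDINGS = {
    "founder": [0.9, 0.1, 0.0],
    "founded": [0.8, 0.2, 0.1],
    "co-founder": [0.85, 0.15, 0.05],
    "headquartered": [0.1, 0.9, 0.1],
    "based": [0.2, 0.8, 0.0],
}


def context_vector(tokens, embeddings):
    """Sum the embeddings of the known tokens; unknown tokens are skipped."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for token in tokens:
        vec = embeddings.get(token)
        if vec is not None:
            total = [a + b for a, b in zip(total, vec)]
    return total


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (norm_u * norm_v)


def matches_seed(candidate_tokens, seed_tokens, embeddings, threshold=0.8):
    """Accept a candidate context if it is close enough to the seed context.

    The threshold is an assumed value; a stricter threshold is one way to
    limit semantic drift across bootstrapping iterations.
    """
    candidate = context_vector(candidate_tokens, embeddings)
    seed = context_vector(seed_tokens, embeddings)
    return cosine(candidate, seed) >= threshold


if __name__ == "__main__":
    seed = ["is", "the", "founder", "of"]
    print(matches_seed(["is", "a", "co-founder", "of"], seed, TOY_EMBEDDINGS))
    print(matches_seed(["is", "headquartered", "in"], seed, TOY_EMBEDDINGS))
```

Unlike a TF-IDF comparison, which scores "founder" and "co-founder" as unrelated because they are different terms, the embedding comparison can treat them as close when their vectors are close.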