Structured Minimally Supervised Learning for Neural Relation Extraction

We present an approach to minimally supervised relation extraction that combines the benefits of learned representations and structured learning, and accurately predicts sentence-level relation mentions given only proposition-level supervision from a knowledge base (KB). By explicitly reasoning about missing data during learning, our approach enables large-scale training of 1D convolutional neural networks while mitigating the label noise inherent in distant supervision. Our approach achieves state-of-the-art results on minimally supervised sentential relation extraction, outperforming a number of baselines, including a competitive approach that uses the attention layer of a purely neural model.
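
To make the supervision setting concrete, the following is a minimal sketch of how proposition-level facts from a KB are commonly aligned with sentences in distant supervision, and how sentence-level predictions can be aggregated back to the entity-pair level. It is an illustration only, not the model described above: the function names `build_bags` and `aggregate`, the `"NA"` label for "no relation", and the deterministic-OR aggregation rule are assumptions introduced for this example.

```python
from collections import defaultdict

def build_bags(sentences, kb_facts):
    """Group sentences by entity pair and attach KB relations as pair-level labels.

    sentences: iterable of (entity1, entity2, text) tuples (hypothetical format).
    kb_facts:  iterable of (entity1, relation, entity2) triples (hypothetical format).
    """
    bags = defaultdict(lambda: {"sentences": [], "relations": set()})
    for e1, e2, text in sentences:
        bags[(e1, e2)]["sentences"].append(text)
    for e1, rel, e2 in kb_facts:
        # Only label pairs that actually occur in the corpus.
        if (e1, e2) in bags:
            bags[(e1, e2)]["relations"].add(rel)
    return dict(bags)

def aggregate(sentence_labels):
    """Deterministic OR (an assumed rule): a relation holds for the pair
    if at least one sentence is predicted to express it."""
    return {label for label in sentence_labels if label != "NA"}

sentences = [
    ("Obama", "Hawaii", "Obama was born in Hawaii."),
    ("Obama", "Hawaii", "Obama visited Hawaii last week."),
]
kb_facts = [("Obama", "born_in", "Hawaii")]
bags = build_bags(sentences, kb_facts)
```

Note that the KB label applies to the whole bag of sentences: the second sentence mentions the same entity pair without expressing `born_in`. A pair can also hold a relation that the KB lacks, which is the missing-data problem the abstract refers to.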
