Multi-View Consistency for Relation Extraction via Mutual Information and Structure Prediction

Relation Extraction (RE) is one of the fundamental tasks in Information Extraction: its goal is to identify the semantic relations between entity mentions in text. Many previous works have shown that sentence structure (i.e., dependency trees) can provide important information/features for RE models. However, a common limitation of prior work on RE is its reliance on external parsers to obtain the syntactic trees that define the sentence structures. On the one hand, independent external parsers are not guaranteed to offer the optimal sentence structures for RE, and structures customized for RE might further improve performance. On the other hand, the quality of external parsers might suffer when they are applied to different domains, which in turn degrades the performance of RE models on those domains. To overcome this issue, we introduce a novel method for RE that simultaneously induces the structures and predicts the relations for the input sentences, thus avoiding external parsers and potentially yielding better sentence structures for RE. Our general strategy for learning RE-specific structures is to apply two different methods to infer the structures for the input sentences (i.e., two views). We then introduce several mechanisms to encourage structural and semantic consistency between these two views so that effective structure and semantic representations for RE can emerge. We perform extensive experiments on the ACE 2005 and SemEval 2010 datasets to demonstrate the advantages of the proposed method, which achieves state-of-the-art performance on these datasets.
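The two-view consistency idea above can be illustrated with a minimal sketch. This is not the paper's actual formulation; it simply shows one plausible structure-consistency objective, assuming each view produces a matrix of head-selection scores per word and the views are aligned by minimizing a symmetric KL divergence between their induced edge distributions (all function names here are hypothetical):

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def edge_distributions(score_matrix):
    # Row i: one view's distribution over candidate heads for word i.
    return [softmax(row) for row in score_matrix]

def kl(p, q, eps=1e-12):
    # KL(p || q) with a small epsilon for numerical safety.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def structure_consistency_loss(scores_view_a, scores_view_b):
    # Symmetric KL between the two views' head distributions,
    # averaged over words; minimizing it pushes the views to agree
    # on the induced sentence structure.
    pa = edge_distributions(scores_view_a)
    pb = edge_distributions(scores_view_b)
    n = len(pa)
    return sum(0.5 * (kl(a, b) + kl(b, a)) for a, b in zip(pa, pb)) / n
```

When the two views produce identical score matrices, the loss is zero; any disagreement in the induced head distributions makes it positive, so a model trained with this term as a regularizer is encouraged to converge on a single RE-specific structure. The paper's semantic consistency between views (e.g., via mutual information estimation) would add a separate term not shown here.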
