Adversarial training for multi-context joint entity and relation extraction

Adversarial training (AT) is a regularization method that can improve the robustness of neural network models by adding small perturbations to the training data. We show how to use AT for the tasks of entity recognition and relation extraction. In particular, we demonstrate that applying AT to a general-purpose baseline model for jointly extracting entities and relations improves on the state-of-the-art effectiveness on several datasets in different contexts (i.e., news, biomedical, and real estate data) and for different languages (English and Dutch).
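
The idea of perturbing training inputs can be illustrated with a minimal sketch. It assumes the common gradient-based formulation of AT, in which the worst-case perturbation is approximated by a step of fixed norm along the gradient of the loss with respect to the input. The toy logistic model, the names `loss_and_grad` and `adversarial_example`, and the value of `epsilon` are hypothetical choices made for illustration only; they are not the joint extraction model or the settings used in this work.

```python
import numpy as np

def loss_and_grad(w, x, y):
    """Logistic loss for one example and its gradient with respect to the input x."""
    p = 1.0 / (1.0 + np.exp(-np.dot(w, x)))  # predicted probability of class 1
    loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    grad_x = (p - y) * w  # d(loss)/dx for the logistic model
    return loss, grad_x

def adversarial_example(w, x, y, epsilon):
    """Return x shifted by a perturbation of L2 norm epsilon along the loss gradient."""
    _, g = loss_and_grad(w, x, y)
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return x.copy()  # no gradient signal, leave the input unchanged
    return x + epsilon * g / norm

# Hypothetical toy values: a fixed weight vector and one input "embedding".
w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 0.2, -0.3])
y = 1.0
epsilon = 0.1  # assumed perturbation size

x_adv = adversarial_example(w, x, y, epsilon)
clean_loss, _ = loss_and_grad(w, x, y)
adv_loss, _ = loss_and_grad(w, x_adv, y)

# In AT, the model is trained on the sum of the clean and adversarial losses.
total_loss = clean_loss + adv_loss
print(clean_loss, adv_loss, total_loss)
```

In this sketch the perturbed input always yields a higher loss than the clean one, so minimizing the combined loss pushes the model to stay accurate in a small neighborhood of each training example.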
