Deep Semantic Role Labeling with Self-Attention

Semantic Role Labeling (SRL) is widely regarded as a crucial step toward natural language understanding and has been studied extensively. In recent years, end-to-end SRL with recurrent neural networks (RNNs) has gained increasing attention. However, handling structural information and long-range dependencies remains a major challenge for RNNs. In this paper, we present a simple and effective architecture for SRL that addresses these problems. Our model is based on self-attention, which can directly capture the relationship between any two tokens regardless of their distance. Our single model achieves F$_1=83.4$ on the CoNLL-2005 shared task dataset and F$_1=82.7$ on the CoNLL-2012 shared task dataset, outperforming the previous state-of-the-art results by $1.8$ and $1.0$ F$_1$ points respectively. In addition, our model is computationally efficient, parsing 50K tokens per second on a single Titan X GPU.
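The claim that self-attention connects any two tokens in a single step can be illustrated with a minimal sketch of one scaled dot-product attention head, the building block popularized by "Attention is All you Need". This is illustrative NumPy code, not the paper's implementation; the function and variable names are ours, and the real model stacks many such heads with additional layers.

```python
import numpy as np

def scaled_dot_product_self_attention(x, wq, wk, wv):
    """One self-attention head over a sequence x of shape (n, d).

    Every token attends to every other token via the (n, n) score
    matrix, so the path between any two positions has length 1,
    regardless of how far apart they are in the sentence.
    """
    q, k, v = x @ wq, x @ wk, x @ wv            # query/key/value projections
    scores = q @ k.T / np.sqrt(k.shape[-1])     # (n, n) pairwise scores
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights                 # context vectors, attention map

# Toy example: 5 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
n, d = 5, 8
x = rng.standard_normal((n, d))
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
context, attn = scaled_dot_product_self_attention(x, wq, wk, wv)
```

Because the attention map is computed for all token pairs at once, the computation is a handful of dense matrix products, which is also what makes the architecture fast on a GPU compared with the sequential updates of an RNN.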
