Semantic Role Labeling with Iterative Structure Refinement

Modern state-of-the-art Semantic Role Labeling (SRL) methods rely on expressive sentence encoders (e.g., multi-layer LSTMs) but tend to model only local interactions, if any, between individual argument labeling decisions. This contrasts both with earlier work and with the intuition that the labels of individual arguments are strongly interdependent. We model interactions between argument labeling decisions through iterative refinement: starting with the output of a factorized model, we repeatedly refine it using a refinement network. Instead of modeling arbitrary interactions among roles and words, we encode prior knowledge about the SRL problem by designing a restricted network architecture that captures non-local interactions. This modeling choice prevents overfitting and results in an effective model that outperforms strong factorized baseline models on all 7 CoNLL-2009 languages and achieves state-of-the-art results on 5 of them, including English.
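The refinement loop described above can be illustrated with a toy sketch. It is not the paper's refinement network: the hand-written rule below (penalizing a non-null role in proportion to the probability mass that other words already place on it) is an assumed stand-in for the learned, restricted architecture. The names `refine_step`, `iterative_refinement`, `penalty`, and `null_role` are all hypothetical. The sketch only shows the overall pattern: a factorized model scores each word independently, and each refinement step re-scores every word using the current beliefs about all other words.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def refine_step(base_scores, probs, penalty, null_role=0):
    """One refinement step (toy rule, NOT the paper's network).

    Each word is re-scored using the role beliefs of all other words.
    """
    n_roles = len(base_scores[0])
    # Total probability mass each role currently receives across all words.
    role_mass = [sum(p[r] for p in probs) for r in range(n_roles)]
    refined = []
    for scores, p in zip(base_scores, probs):
        adjusted = []
        for r in range(n_roles):
            # Mass that the *other* words put on role r: a non-local signal.
            others = role_mass[r] - p[r]
            # Assumed interaction: discourage duplicated non-null roles.
            bias = 0.0 if r == null_role else -penalty * others
            adjusted.append(scores[r] + bias)
        refined.append(softmax(adjusted))
    return refined


def iterative_refinement(base_scores, steps=3, penalty=2.0):
    """Start from the factorized prediction, then refine it `steps` times."""
    probs = [softmax(s) for s in base_scores]  # independent per-word prediction
    for _ in range(steps):
        probs = refine_step(base_scores, probs, penalty)
    return probs


def decode(probs):
    """Pick the most probable role index for each word."""
    return [max(range(len(p)), key=p.__getitem__) for p in probs]


# Hypothetical scores for three words over roles [null, A0, A1].
base = [
    [0.0, 2.0, 0.0],  # clearly A0
    [0.0, 1.0, 0.9],  # slightly prefers A0 in isolation, A1 is close
    [2.0, 0.0, 0.0],  # clearly not an argument
]
factorized = decode([softmax(s) for s in base])
refined = decode(iterative_refinement(base))
print(factorized, refined)
```

In this toy input the factorized decision assigns A0 to both of the first two words, while the refined decision moves the second word to A1 because the first word already claims A0 with high confidence. The refinement network in the paper learns such interactions rather than hard-coding them.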
