Open-Domain Semantic Role Labeling by Modeling Word Spans

Most supervised language processing systems suffer a substantial drop in performance when tested on text from a domain that differs markedly from that of the training data. Semantic role labeling systems are typically trained on newswire text, and in tests their performance on fiction is as much as 19% worse than on newswire. We investigate techniques for building open-domain semantic role labeling systems that approach the ideal of a train-once, use-anywhere system. We leverage recently developed techniques for learning representations of text using latent-variable language models, and extend them to provide the kinds of features that are useful for semantic role labeling. In experiments, our novel system reduces error by 16% relative to the previous state of the art on out-of-domain text.
