Inducing Semantic Roles Without Syntax

Semantic roles are a key component of linguistic predicate-argument structure, but developing ontologies of these roles requires significant expertise and manual effort. Methods exist for automatically inducing semantic roles using syntactic representations, but syntax can also be difficult to define, annotate, and predict. We show it is possible to automatically induce semantic roles from QA-SRL, a scalable and ontology-free semantic annotation scheme that uses question-answer pairs to represent predicate-argument structure. By associating arguments with distributions over QA-SRL questions and clustering them in a mixture model, our method outperforms all previous models as well as a new state-of-the-art baseline over gold syntax. We show that our method works because QA-SRL acts as surrogate syntax, capturing non-overt arguments and syntactic alternations, which are central motivators for the use of semantic role labeling systems.
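The abstract does not spell out the clustering procedure, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes each argument has already been mapped to a probability vector over a fixed list of QA-SRL questions, and it clusters those vectors by greedy agglomerative merging under a hard-assignment mixture of categorical distributions, with the mixing weights omitted for simplicity. The function names (`entropy`, `merge_cost`, `induce_roles`), the toy questions, and the toy probabilities are all hypothetical.

```python
import numpy as np


def entropy(p):
    """Shannon entropy (in nats) of a probability vector, treating 0 log 0 as 0."""
    nz = p[p > 0]
    return float(-(nz * np.log(nz)).sum())


def merge_cost(n_a, mean_a, n_b, mean_b):
    """Expected log-likelihood lost by merging two clusters.

    Each cluster is modeled by one categorical distribution (its mean), so the
    loss equals a size-weighted Jensen-Shannon divergence and is never negative.
    """
    merged = (n_a * mean_a + n_b * mean_b) / (n_a + n_b)
    return (n_a + n_b) * entropy(merged) - n_a * entropy(mean_a) - n_b * entropy(mean_b)


def induce_roles(question_dists, n_roles):
    """Greedily merge arguments until n_roles clusters remain.

    question_dists: array-like of shape (n_args, n_questions); rows sum to 1.
    Returns a list with one cluster label per argument.
    """
    P = np.asarray(question_dists, dtype=float)
    # Every argument starts as its own cluster: (member indices, mean distribution).
    clusters = [([i], P[i]) for i in range(len(P))]
    while len(clusters) > n_roles:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ia, ma), (ib, mb) = clusters[a], clusters[b]
                cost = merge_cost(len(ia), ma, len(ib), mb)
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ia, ma), (ib, mb) = clusters[a], clusters[b]
        merged = (ia + ib, (len(ia) * ma + len(ib) * mb) / (len(ia) + len(ib)))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    labels = [0] * len(P)
    for label, (members, _) in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical question inventory for the predicate "give".
questions = [
    "Who gave something?",
    "Who was something given by?",
    "What was given?",
    "What did someone give?",
]
# Toy distributions: two giver arguments (active and passive) and two gift arguments.
toy_dists = [
    [0.9, 0.1, 0.0, 0.0],
    [0.2, 0.8, 0.0, 0.0],
    [0.0, 0.0, 0.6, 0.4],
    [0.0, 0.0, 0.3, 0.7],
]
print(induce_roles(toy_dists, n_roles=2))
```

In this toy setup the active and passive giver arguments place their mass on different questions, yet they overlap far more with each other than with the gift arguments, so merging them costs less likelihood. That is one way to read the claim that question distributions can stand in for syntax when handling alternations such as the passive.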
