Sister Help: Data Augmentation for Frame-Semantic Role Labeling

While FrameNet is widely regarded as a rich semantic resource for natural language processing, a major criticism concerns its limited coverage and the relative paucity of its labeled data compared to other commonly used lexical resources such as PropBank and VerbNet. This paper reports on a pilot study to address these gaps. We propose a data augmentation approach that uses existing frame-specific annotation to automatically annotate other, unannotated lexical units of the same frame. Our rule-based approach defines the notion of a **sister lexical unit** and generates frame-specific augmented data for training. We present experiments on frame-semantic role labeling that demonstrate the importance of this data augmentation: we obtain a large improvement over prior results on frame identification and argument identification for FrameNet, using both the full-text and lexicographic annotations in FrameNet. Our findings on data augmentation highlight the value of automatic resource creation for improved models in frame-semantic parsing.
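
The augmentation idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not spell out the rules, so the sketch assumes that a sister lexical unit is any other lexical unit in the same frame with the same part of speech, and that augmentation simply substitutes the target word while copying the role spans unchanged. The `Annotation` class, the toy `FRAME_LEXICON`, and the functions `sister_lus` and `augment` are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Annotation:
    """One annotated sentence for a single frame-evoking target (hypothetical schema)."""
    tokens: Tuple[str, ...]
    frame: str
    target_index: int                         # position of the target token
    target_lu: str                            # lexical unit as "lemma.pos", e.g. "buy.v"
    roles: Tuple[Tuple[str, int, int], ...]   # (role name, start, end), inclusive token spans


# Toy lexicon mapping a frame to its lexical units (illustrative only).
FRAME_LEXICON: Dict[str, List[str]] = {
    "Commerce_buy": ["buy.v", "purchase.v", "purchase.n"],
}


def sister_lus(frame: str, lu: str, lexicon: Dict[str, List[str]]) -> List[str]:
    """Assumed rule: sisters share the frame and the part of speech of `lu`."""
    pos = lu.rsplit(".", 1)[1]
    return [
        cand for cand in lexicon.get(frame, [])
        if cand != lu and cand.rsplit(".", 1)[1] == pos
    ]


def augment(ann: Annotation, lexicon: Dict[str, List[str]]) -> List[Annotation]:
    """Create one new annotation per sister LU by swapping in the sister's lemma.

    Role spans are copied as-is, which is only valid because the target is a
    single token and the replacement is a single token too. Inflection is not
    handled here; the example uses a base-form verb to sidestep it.
    """
    augmented = []
    for sister in sister_lus(ann.frame, ann.target_lu, lexicon):
        lemma = sister.rsplit(".", 1)[0]
        tokens = list(ann.tokens)
        tokens[ann.target_index] = lemma
        augmented.append(replace(ann, tokens=tuple(tokens), target_lu=sister))
    return augmented


example = Annotation(
    tokens=("They", "will", "buy", "a", "car"),
    frame="Commerce_buy",
    target_index=2,
    target_lu="buy.v",
    roles=(("Buyer", 0, 0), ("Goods", 3, 4)),
)
augmented = augment(example, FRAME_LEXICON)
```

With these assumptions, the single annotated sentence for `buy.v` yields one extra training instance for `purchase.v` with the same `Buyer` and `Goods` spans. A realistic system would also need to handle inflected targets, multiword lexical units, and cases where swapping the target changes the sentence's grammaticality.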
