Sentences with Gapping: Parsing and Reconstructing Elided Predicates

Sentences with gapping, such as "Paul likes coffee and Mary tea", lack an overt predicate that indicates the relation between two or more arguments. Parsers often produce poor surface-syntax representations of such sentences, and even correct representations are not well suited to downstream natural language understanding tasks such as relation extraction, which are typically designed to extract information from sentences with canonical clause structure. In this paper, we present two methods for parsing to a Universal Dependencies graph representation that explicitly encodes the elided material with additional nodes and edges. We find that both methods can reconstruct elided material from dependency trees with high accuracy when the parser correctly predicts the existence of a gap. We further demonstrate, through a case study on Swedish, that one of our methods can be applied to other languages.
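
As a rough illustration of what a graph with additional nodes and edges can look like, the following sketch encodes the example sentence with an empty node for the elided predicate and reads subject-predicate-object triples off it. It is not either of the two parsing methods. The `Edge` class, the `extract_triples` helper, and the hand-written edge list are hypothetical names and data introduced here; the token ID "5.1" follows the CoNLL-U convention for empty nodes, and relation labels are simplified (no enhanced subtypes).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    head: str  # token ID of the governor ("0" is the artificial root)
    dep: str   # token ID of the dependent
    rel: str   # dependency relation label


# Tokens of the example sentence. "5.1" is an empty node standing in
# for the elided copy of "likes" in the second conjunct (assumption:
# CoNLL-U style decimal IDs for empty nodes).
TOKENS = {
    "1": "Paul", "2": "likes", "3": "coffee",
    "4": "and", "5": "Mary", "5.1": "likes", "6": "tea",
}

# Hand-written graph for illustration only: the empty node heads the
# second conjunct and governs its arguments directly.
ENHANCED_EDGES = [
    Edge("0", "2", "root"),
    Edge("2", "1", "nsubj"),
    Edge("2", "3", "obj"),
    Edge("2", "5.1", "conj"),
    Edge("5.1", "4", "cc"),
    Edge("5.1", "5", "nsubj"),
    Edge("5.1", "6", "obj"),
]


def extract_triples(tokens, edges):
    """Return (subject, predicate, object) triples for every node that
    governs both an nsubj and an obj dependent."""
    triples = []
    for pred_id, pred_form in tokens.items():
        subjects = [tokens[e.dep] for e in edges
                    if e.head == pred_id and e.rel == "nsubj"]
        objects = [tokens[e.dep] for e in edges
                   if e.head == pred_id and e.rel == "obj"]
        for subj in subjects:
            for obj in objects:
                triples.append((subj, pred_form, obj))
    return triples


print(extract_triples(TOKENS, ENHANCED_EDGES))
```

Because the second conjunct has its own predicate node, a simple extractor written for canonical clauses recovers both relations without any gapping-specific logic.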
