Rebanking CCGbank for Improved NP Interpretation

Once released, treebanks tend to remain unchanged despite any shortcomings in their depth of linguistic analysis or coverage of specific phenomena. Instead, separate resources are created to address such problems. In this paper we show how to improve the quality of a treebank by integrating resources and implementing improved analyses for specific constructions. We demonstrate this rebanking process by creating an updated version of CCGbank that includes the predicate-argument structure of both verbs and nouns, base-NP brackets, verb-particle constructions, and restrictive and non-restrictive nominal modifiers, and we evaluate the impact of these changes on a statistical parser.
