Deep Syntax Annotation of the Sequoia French Treebank

We define a deep syntactic representation scheme for French that abstracts away from surface syntactic variation and diathesis alternations, and we describe the annotation of deep syntactic representations on top of the surface dependency trees of the Sequoia corpus. The resulting deep-annotated corpus, named DEEP-SEQUOIA, is freely available. We hope it will be useful for corpus linguistics studies and for training deep analyzers as a step toward semantic analysis.
