Coordination Structures in Dependency Treebanks

Paratactic syntactic structures are notoriously difficult to represent in dependency formalisms. One painful consequence is a high frequency of parsing errors related to coordination. Coordination thus remains an open problem in the dependency analysis of natural languages. This paper sheds some light on the area by offering a systematic view of the formal means developed for encoding coordination structures. We introduce a novel taxonomy of such approaches and apply it to treebanks across a typologically diverse range of 26 languages. In addition, we present empirical observations on the convertibility between selected styles of representation.
