Universal Dependencies v1: A Multilingual Treebank Collection

Cross-linguistically consistent annotation is necessary for sound comparative evaluation and cross-lingual learning experiments. It is also useful for multilingual system development and comparative linguistic studies. Universal Dependencies is an open community effort to create cross-linguistically consistent treebank annotation for many languages within a dependency-based lexicalist framework. In this paper, we describe v1 of the universal guidelines, the underlying design principles, and the currently available treebanks for 33 languages.
