Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection

Universal Dependencies is an open community effort to create cross-linguistically consistent treebank annotation for many languages within a dependency-based lexicalist framework. The annotation consists in a linguistically motivated word segmentation; a morphological layer comprising lemmas, universal part-of-speech tags, and standardized morphological features; and a syntactic layer focusing on syntactic relations between predicates, arguments and modifiers. In this paper, we describe version 2 of the guidelines (UD v2), discuss the major changes from UD v1 to UD v2, and give an overview of the currently available treebanks for 90 languages.

[1]  Christopher D. Manning,et al.  Enhanced English Universal Dependencies: An Improved Representation for Natural Language Understanding Tasks , 2016, LREC.

[2]  Daniel Zeman,et al.  Coordination Structures in Dependency Treebanks , 2013, ACL.

[3]  Timothy Baldwin,et al.  Multiword Expressions: A Pain in the Neck for NLP , 2002, CICLing.

[4]  Sandra A. Thompson,et al.  Discourse Motivations for the Core-Oblique Distinction as a Language Universal , 1997 .

[5]  Joakim Nivre,et al.  Enhancing Universal Dependency Treebanks: A Case Study , 2018, UDW@EMNLP.

[6]  Nizar Habash,et al.  CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies , 2017, CoNLL.

[7]  John Robert Ross,et al.  Constraints on variables in syntax , 1967 .

[8]  A. Andrews The major functions of the noun phrase , 2007 .

[9]  Daniel Zeman,et al.  Towards Deep Universal Dependencies , 2019, Proceedings of the Fifth International Conference on Dependency Linguistics (Depling, SyntaxFest 2019).

[10]  Joakim Nivre,et al.  Universal Stanford dependencies: A cross-linguistic typology , 2014, LREC.

[11]  Daniel Zeman,et al.  Reusable Tagset Conversion Using Tagset Drivers , 2008, LREC.

[12]  Christopher D. Manning,et al.  Generating Typed Dependency Parses from Phrase Structure Parses , 2006, LREC.

[13]  Sampo Pyysalo,et al.  Universal Dependencies v1: A Multilingual Treebank Collection , 2016, LREC.

[14]  Christo Kirov,et al.  A Language-Independent Feature Schema for Inflectional Morphology , 2015, ACL.

[15]  Sylvain Kahane,et al.  Non-constituent coordination and other coordinative constructions as Dependency Graphs , 2015, DepLing.

[16]  Slav Petrov,et al.  A Universal Part-of-Speech Tagset , 2011, LREC.

[17]  Christopher D. Manning,et al.  The Stanford Typed Dependencies Representation , 2008, CF+CDPE@COLING.

[18]  Martin Potthast,et al.  CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies , 2018, CoNLL.