Survey of Uralic Universal Dependencies development

This paper attempts to evaluate some of the systematic differences among the Uralic Universal Dependencies treebanks, with the aim of identifying reasonable improvements to annotation consistency within this language family. The study finds that the coverage of Uralic languages in the project is already relatively high, and that the majority of typically Uralic features are already present and can be discussed on the basis of existing treebanks. Some of the idiosyncrasies found in individual treebanks stem from language-internal grammar traditions and could be targets for harmonization in later phases.
