Proceedings of the Second International Workshop on Computational Linguistics for Uralic Languages

In this survey we apply the methodology of [1] to the Uralic family with the specific goal of triage, to help the community decide where effort is best placed. As in battlefield triage, where the relatively lightly wounded and the very heavily wounded are treated last, we suggest directing the very limited resources of the computational linguistics community towards the middle class of borderline languages, where neither vital nor still/heritage status can be established. The talk will complement the survey of [2] from the digital perspective.

[1] Heli Uibo, et al. Oahpa! Õpi! Opiq! Developing free online programs for learning Estonian and Võro, 2015.

[2] Nadir Durrani, et al. A Joint Sequence Translation Model with Integrated Reordering, 2011, ACL.

[3] Joakim Nivre, et al. MaltOptimizer: An Optimization Tool for MaltParser, 2012, EACL.

[4] Matthew G. Snover, et al. A Study of Translation Edit Rate with Targeted Human Annotation, 2006, AMTA.

[5] M. Dunn, et al. Structural Phylogeny in Historical Linguistics: Methodological Explorations Applied in Island Melanesia, 2008.

[6] Richard M. Schwartz, et al. Fast and Robust Neural Network Joint Models for Statistical Machine Translation, 2014, ACL.

[7] Antonio Toral, et al. Abu-MaTran at WMT 2015 Translation Task: Morphological Segmentation and Web Crawling, 2015, WMT@EMNLP.

[8] C. Holden, et al. Bantu language trees reflect the spread of farming across sub-Saharan Africa: a maximum-parsimony analysis, 2002, Proceedings of the Royal Society of London. Series B: Biological Sciences.

[9] Eckhard Bick, et al. CG-3 - Beyond Classical Constraint Grammar, 2015, NODALIDA.

[10] Lene Antonsen, et al. Interactive pedagogical programs based on constraint grammar, 2009, NODALIDA.

[11] Atro Voutilainen, et al. A language-independent system for parsing unrestricted text, 1995.

[12] Kalle Korhonen, et al. Shedding more light on language classification using basic vocabularies and phylogenetic methods: A case study of Uralic, 2013.

[13] Philipp Koehn, et al. Findings of the 2014 Workshop on Statistical Machine Translation, 2014, WMT@ACL.

[14] Joakim Nivre, et al. Universal Dependency Annotation for Multilingual Parsing, 2013, ACL.

[15] Christian Federmann. Appraise: An Open-Source Toolkit for Manual Evaluation of Machine Translation Output, 2012.

[16] Heiki-Jaan Kaalep, et al. An Estonian Morphological Analyser and the Impact of a Corpus on Its Development, 1997, Comput. Humanit.

[17] Philipp Koehn, et al. Moses: Open Source Toolkit for Statistical Machine Translation, 2007, ACL.

[18] V. Moulton, et al. Neighbor-net: an agglomerative method for the construction of phylogenetic networks, 2002, Molecular biology and evolution.

[19] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.

[20] Alon Lavie, et al. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments, 2005, IEEvaluation@ACL.

[21] Kadri Muischnek, et al. Dependency Parsing of Estonian: Statistical and Rule-based Approaches, 2014, Baltic HLT.

[22] Bernd Bohnet, et al. Top Accuracy and Fast Dependency Parsing is not a Contradiction, 2010, COLING.

[23] Eckhard Bick. LingPars, a Linguistically Inspired, Language-Independent Machine Learner for Dependency Treebanks, 2006, CoNLL.

[24] K. Muischnek, et al. Estonian Particle Verbs And Their Syntactic Analysis, 2013.

[25] Matt Post, et al. Efficient Elicitation of Annotations for Human Evaluation of Machine Translation, 2014, WMT@ACL.

[26] Philipp Koehn, et al. Edinburgh System Description for the 2005 IWSLT Speech Translation Evaluation, 2005.

[27] Ashish Vaswani, et al. Decoding with Large-Scale Neural Language Models Improves Translation, 2013, EMNLP.

[28] Claire Bowern, et al. Computational phylogenetics and the internal structure of Pama-Nyungan, 2012.

[29] Alon Lavie, et al. Combining Machine Translation Output with Open Source: The Carnegie Mellon Multi-Engine Machine Translation Scheme, 2010, Prague Bull. Math. Linguistics.

[30] Ebru Arisoy, et al. Unsupervised segmentation of words into morphemes - Challenge 2005, An Introduction and Evaluation Report, 2006.

[31] Tommi A. Pirinen, et al. Omorfi — Free and open source morphological lexical database for Finnish, 2015, NODALIDA.

[32] Joakim Nivre, et al. MaltParser: A Language-Independent System for Data-Driven Dependency Parsing, 2007, Natural Language Engineering.

[33] Lene Antonsen, et al. Constraint Grammar in Dialogue Systems, 2009.

[34] Johannes Dellert, et al. Compiling the Uralic Dataset for NorthEuraLex, a Lexicostatistical Database of Northern Eurasia, 2015.

[35] Christopher D. Manning, et al. A Simple and Effective Hierarchical Phrase Reordering Model, 2008, EMNLP.

[36] Johann-Mattis List, et al. Networks of lexical borrowing and lateral gene transfer in language and genome evolution, 2013, BioEssays : news and reviews in molecular, cellular and developmental biology.