New treebank or repurposed? On the feasibility of cross-lingual parsing of Romance languages with Universal Dependencies†

This is the final peer-reviewed manuscript that was accepted for publication in Natural Language Engineering. Changes resulting from the publishing process, such as editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document.
