Towards Instance-Level Parser Selection for Cross-Lingual Transfer of Dependency Parsers

Current methods of cross-lingual parser transfer focus on predicting the best parser for a low-resource target language globally, that is, "at the treebank level". In this work, we propose and argue for a novel cross-lingual transfer paradigm: instance-level parser selection (ILPS), and present a proof-of-concept study focused on instance-level selection in the framework of delexicalized parser transfer. We start from the empirical observation that different source parsers are the best choice for different Universal POS sequences in the target language. We then propose to predict the best parser at the instance level. To this end, we train a supervised regression model, based on the Transformer architecture, to predict parser accuracies for individual POS sequences. We compare ILPS against two strong single-best parser selection (SBPS) baselines: (1) a model that compares POS n-gram distributions between the source and target languages (KL) and (2) a model that selects the source based on the similarity between manually created language vectors encoding syntactic properties of the languages (L2V). The results of our extensive evaluation, coupling 42 source parsers with 20 diverse low-resource test languages, show that ILPS outperforms KL and L2V on 13/20 and 14/20 test languages, respectively. Further, we show that by aggregating the predictions of our instance-level model to select the best parser "at the treebank level" (SBPS), we outperform the same baselines on 17/20 and 16/20 test languages.
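
The KL baseline compares POS n-gram distributions between each candidate source treebank and the target text, and picks the closest source. Below is a minimal sketch of such a selector, assuming corpora are given as lists of UPOS-tag sequences; the trigram order, the smoothing constant, and the function names are illustrative simplifications rather than the exact formulation used in the paper.

```python
# A minimal sketch of KL-based single-best parser selection, assuming target
# and source corpora are given as lists of UPOS-tag sequences.
from collections import Counter
from math import log


def pos_trigram_counts(sentences):
    """Count UPOS trigrams over a corpus of tag sequences."""
    counts = Counter()
    for tags in sentences:
        for i in range(len(tags) - 2):
            counts[tuple(tags[i:i + 3])] += 1
    return counts


def kl_to_source(target_sents, source_sents, eps=1e-9):
    """KL divergence of the target POS-trigram distribution from a source one;
    lower means the source treebank looks more like the target text."""
    tgt = pos_trigram_counts(target_sents)
    src = pos_trigram_counts(source_sents)
    tgt_total, src_total = sum(tgt.values()), sum(src.values())
    if tgt_total == 0:
        return float("inf")
    score = 0.0
    for trigram, count in tgt.items():
        p = count / tgt_total
        q = src.get(trigram, 0) / src_total if src_total else 0.0
        score += p * log(p / max(q, eps))
    return score


def select_single_best(target_sents, source_treebanks):
    """Single-best parser selection: choose the source language whose
    POS-trigram distribution is closest to the target."""
    return min(source_treebanks,
               key=lambda lang: kl_to_source(target_sents, source_treebanks[lang]))
```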

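ILPS itself then amounts to choosing, for each target POS sequence, the source parser with the highest predicted accuracy, while the treebank-level variant aggregates those per-instance predictions into a single choice. The sketch below assumes the trained regression model has already scored every (sentence, parser) pair; the length-weighted averaging used for aggregation is a hypothetical choice for illustration, not necessarily the paper's exact scheme.

```python
# A minimal sketch of instance-level selection (ILPS) and its aggregation into
# a treebank-level choice (SBPS), given a matrix of predicted accuracies.
import numpy as np


def ilps_select(predicted_acc):
    """predicted_acc: (n_sentences, n_parsers) array of regression-model scores,
    one per target POS sequence and candidate source parser.
    Returns the index of the parser to apply to each sentence."""
    return predicted_acc.argmax(axis=1)


def sbps_by_aggregation(predicted_acc, sent_lengths=None):
    """Treebank-level selection: aggregate the instance-level predictions,
    here by a (length-weighted) average over sentences, and pick the single
    source parser with the highest aggregate score."""
    n_sents = predicted_acc.shape[0]
    weights = np.ones(n_sents) if sent_lengths is None else np.asarray(sent_lengths, dtype=float)
    weights = weights / weights.sum()
    per_parser = (weights[:, None] * predicted_acc).sum(axis=0)
    return int(per_parser.argmax())
```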