Multi-source Cross-lingual Delexicalized Parser Transfer: Prague or Stanford?

We compare two annotation styles, Prague dependencies and Universal Stanford Dependencies, in terms of their suitability for parsing. We focus specifically on the adposition attachment style used in each formalism, and evaluate it in multi-source cross-lingual delexicalized dependency parser transfer performed by parse tree combination. We show that, in our setting, converting the adposition annotation in the Prague-style training treebanks to Stanford style leads to promising results. The best results are obtained by parsing the target sentences with parsers trained on treebanks in both adposition annotation styles in parallel, converting all resulting parse trees to the Stanford adposition style, and then combining them (+0.39% UAS over the Prague-style baseline). The improvements are considerably larger when a smaller set of diverse source treebanks is used (up to +2.24% UAS over the baseline).
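To make the adposition conversion concrete, the following is a minimal sketch and not the conversion procedure used in the paper. It assumes a tree is given as a list of head indices (0-based, with -1 for the root) and a parallel list of universal POS tags; the function name `prague_to_stanford_adpositions` and this data layout are hypothetical. In the Prague style the adposition heads its noun, whereas in the Stanford style the noun attaches to the adposition's former parent and the adposition becomes a dependent of the noun. The sketch handles only the simple case of an adposition with exactly one non-adposition child and leaves all other cases unchanged.

```python
def prague_to_stanford_adpositions(heads, upos):
    """Re-attach adpositions from Prague style to Stanford style.

    heads[i] is the 0-based index of token i's head, or -1 for the root.
    upos[i] is the universal POS tag of token i.
    Illustrative simplification: only adpositions with exactly one child
    that is not itself an adposition are converted.
    """
    new_heads = list(heads)
    for adp, tag in enumerate(upos):
        if tag != "ADP":
            continue
        # Children are looked up in the original (Prague-style) tree.
        children = [i for i, h in enumerate(heads) if h == adp]
        if len(children) != 1 or upos[children[0]] == "ADP":
            continue  # ambiguous or chained adpositions: leave as-is
        child = children[0]
        new_heads[child] = heads[adp]  # noun takes over the adposition's parent
        new_heads[adp] = child         # adposition now depends on the noun
    return new_heads


# "He lives in Prague": in Prague style, "in" heads "Prague".
prague_heads = [1, -1, 1, 2]
tags = ["PRON", "VERB", "ADP", "NOUN"]
stanford_heads = prague_to_stanford_adpositions(prague_heads, tags)
print(stanford_heads)  # [1, -1, 3, 1]
```

In the converted tree, "Prague" attaches directly to "lives" and "in" attaches to "Prague". Applying the same conversion to every parser's output before combination puts all trees in a single adposition style, which is the precondition for combining them edge by edge.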
