Paper Information - Multilingual Dependency Parsing: Using Machine Translated Texts instead of Parallel Corpora

This paper revisits the projection-based approach to the task of dependency grammar induction. Traditional cross-lingual dependency induction approaches depend, in one way or another, on the existence of bitexts or of target language tools such as part-of-speech (POS) taggers to obtain reasonable parsing accuracy. In this paper, we transfer dependency parsers using only approximate resources, i.e., machine translated bitexts instead of manually created bitexts. We do this by obtaining the source side of the text from a machine translation (MT) system and then applying transfer approaches to induce parsers for the target languages. We further reduce the need for labeled target language resources by using an unsupervised target tagger. We show that our approach consistently outperforms unsupervised parsers by a large margin (8.2% absolute) and achieves performance similar to that of delexicalized transfer parsers.
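
The abstract does not spell out the projection step, so the following is only a minimal sketch of the general idea behind projection-based transfer, not the authors' implementation. It assumes a parsed source sentence (for example, the MT output), a one-to-one word alignment given as a dictionary, and head indices with `-1` marking the root. The function name `project_heads` and its parameters are hypothetical.

```python
from typing import Dict, List, Optional


def project_heads(
    source_heads: List[int],
    alignment: Dict[int, int],
    target_len: int,
) -> List[Optional[int]]:
    """Project dependency heads from a source sentence onto a target sentence.

    source_heads[i] is the head index of source token i (-1 for the root).
    alignment maps a source token index to a target token index
    (assumed one-to-one here, which real alignments often are not).
    Returns a list of target head indices; None marks tokens for which
    no head could be projected.
    """
    target_heads: List[Optional[int]] = [None] * target_len
    for s_child, s_head in enumerate(source_heads):
        t_child = alignment.get(s_child)
        if t_child is None:
            continue  # unaligned source token: nothing to project
        if s_head == -1:
            target_heads[t_child] = -1  # the root stays the root
            continue
        t_head = alignment.get(s_head)
        # Project the edge only when the head is aligned and the edge
        # would not become a self-loop on the target side.
        if t_head is not None and t_head != t_child:
            target_heads[t_child] = t_head
    return target_heads


# Source "John sees Mary" (heads: John -> sees, sees = root, Mary -> sees),
# aligned to a hypothetical verb-final target sentence.
projected = project_heads([1, -1, 1], {0: 0, 1: 2, 2: 1}, 3)
print(projected)
```

Tokens left as `None` would need a fallback in practice, such as a default attachment or filtering out incompletely projected sentences before training the target parser.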

Zdeněk Žabokrtský | Loganathan Ramasamy | David Mareček
