Czech-English Dependency Tree-based Machine Translation

We present preliminary results of a Czech-English translation system based on dependency trees. The fully automated process includes morphological tagging, analytical and tectogrammatical parsing of Czech, tectogrammatical transfer based on lexical substitution using word-to-word translation dictionaries enhanced by information from the English-Czech parallel corpus of WSJ, and a simple rule-based system for generation from the English tectogrammatical representation. In the evaluation part, we compare the results of the fully automated and the manually annotated processes of building the tectogrammatical representation.
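The transfer step described above, lexical substitution over a dependency tree, can be illustrated with a minimal sketch. It is not the authors' implementation: the `Node` class, the `transfer` and `lemmas` functions, and the three-entry toy dictionary are all hypothetical names and data introduced here for illustration. The sketch assumes a plain one-to-one dictionary and leaves out the corpus-based enhancement the abstract mentions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One node of a dependency tree: a lemma, a functor label, and children."""
    lemma: str
    functor: str
    children: List["Node"] = field(default_factory=list)


def transfer(node: Node, dictionary: Dict[str, str]) -> Node:
    """Copy the tree, replacing each source lemma by its dictionary translation.

    Tree structure and functor labels are kept unchanged; lemmas missing from
    the dictionary are passed through untranslated.
    """
    return Node(
        lemma=dictionary.get(node.lemma, node.lemma),
        functor=node.functor,
        children=[transfer(child, dictionary) for child in node.children],
    )


def lemmas(node: Node) -> List[str]:
    """List lemmas in depth-first, parent-first order (for inspection only)."""
    out = [node.lemma]
    for child in node.children:
        out.extend(lemmas(child))
    return out


# Toy Czech tree for "pes vidí kočku" ("the dog sees the cat").
# The dictionary entries are assumed for illustration only.
toy_dictionary = {"vidět": "see", "pes": "dog", "kočka": "cat"}
czech_tree = Node("vidět", "PRED", [Node("pes", "ACT"), Node("kočka", "PAT")])
english_tree = transfer(czech_tree, toy_dictionary)
print(lemmas(english_tree))  # prints ['see', 'dog', 'cat']
```

Because the sketch only swaps lemmas, word order and function words are not produced at this stage; in the pipeline the abstract describes, that is left to the separate rule-based generation step.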

Martin Čmejrek | Jan Cuřín | Jiří Havelka
