Phrase-Based Backoff Models for Machine Translation of Highly Inflected Languages

We propose a backoff model for phrase-based machine translation that translates unseen word forms in foreign-language text by hierarchical morphological abstractions at the word and the phrase level. The model is evaluated on the Europarl corpus for German-English and Finnish-English translation and shows improvements over state-of-the-art phrase-based models.
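The backoff idea can be illustrated with a minimal word-level sketch. It is not the model from the paper: the suffix list, the `toy_stem` function, the two tables, and the `translate_with_backoff` helper are all hypothetical stand-ins, and a real system would score phrase pairs with probabilities rather than return a single string. The sketch only shows the lookup order: try the surface form first, then progressively more abstract forms.

```python
from typing import Callable, Dict, List, Optional, Tuple

# One backoff level: a function that abstracts a word, plus the table to look it up in.
Level = Tuple[Callable[[str], str], Dict[str, str]]


def identity(word: str) -> str:
    """Level 0: the fully inflected surface form, unchanged."""
    return word


def toy_stem(word: str) -> str:
    """Hypothetical stand-in for a morphological analyzer.

    Strips the first matching suffix from a small, made-up list,
    keeping at least three characters of the word.
    """
    for suffix in ("en", "er", "es", "e", "n", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def translate_with_backoff(word: str, levels: List[Level]) -> Optional[str]:
    """Return the translation from the first level whose table contains the word."""
    for abstract, table in levels:
        key = abstract(word)
        if key in table:
            return table[key]
    return None  # unseen at every level of abstraction


# Toy tables (assumed data, for illustration only).
surface_table = {"haus": "house"}
stem_table = {"haus": "house", "kind": "child"}
levels: List[Level] = [(identity, surface_table), (toy_stem, stem_table)]

for w in ("haus", "hauses", "kindes", "xyz"):
    print(w, "->", translate_with_backoff(w, levels))
```

Ordering the levels from most to least specific means the abstracted tables are consulted only when the surface form is unseen, which is what makes this a backoff scheme rather than a blanket normalization of the input.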
