Using English as a Pivot Language to Enhance Danish-Arabic Statistical Machine Translation

We inspect two pivot strategies for a Danish-Arabic statistical machine translation (SMT) system: a phrase translation pivot strategy and a sentence translation pivot strategy. English is used as the pivot language. We develop two SMT systems, Danish-English and English-Arabic, using different English-Arabic and English-Danish data resources. Our final results show that SMT systems developed under the sentence-based pivot strategy outperform the system developed under the phrase-based pivot strategy, especially when common parallel corpora are not available.
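The difference between the two pivot strategies can be illustrated with a minimal Python sketch. It is not the system described here. It assumes the phrase pivot combines the Danish-English and English-Arabic phrase tables by summing over shared English phrases, as is common in the pivot literature, and that the sentence pivot simply chains two translators. The function names, the toy probabilities, and the transliterated Arabic words are all hypothetical.

```python
from collections import defaultdict


def phrase_pivot(src_to_pivot, pivot_to_tgt):
    """Build a source-target table by marginalising over pivot phrases.

    Assumed formula (not confirmed by the abstract):
        p(tgt | src) = sum over pivot of p(pivot | src) * p(tgt | pivot)
    """
    table = defaultdict(lambda: defaultdict(float))
    for src, pivots in src_to_pivot.items():
        for pivot, p_pivot in pivots.items():
            # Pivot phrases missing from the second table contribute nothing.
            for tgt, p_tgt in pivot_to_tgt.get(pivot, {}).items():
                table[src][tgt] += p_pivot * p_tgt
    return {src: dict(tgts) for src, tgts in table.items()}


def greedy_translator(table):
    """Toy word-by-word translator: pick the most probable entry per word."""
    def translate(sentence):
        words = []
        for word in sentence.split():
            options = table.get(word)
            # Unknown words are passed through unchanged.
            words.append(max(options, key=options.get) if options else word)
        return " ".join(words)
    return translate


def sentence_pivot(sentence, src_to_pivot_system, pivot_to_tgt_system):
    """Translate source to pivot, then feed that output to the second system."""
    return pivot_to_tgt_system(src_to_pivot_system(sentence))


# Hypothetical toy tables; Arabic words are transliterated for readability.
da_en = {"hus": {"house": 0.8, "home": 0.2}}
en_ar = {
    "house": {"bayt": 0.6, "manzil": 0.4},
    "home": {"bayt": 0.5, "manzil": 0.5},
}

da_ar = phrase_pivot(da_en, en_ar)
arabic = sentence_pivot("hus", greedy_translator(da_en), greedy_translator(en_ar))
```

In the phrase pivot, all English alternatives contribute to the Danish-Arabic table, whereas the sentence pivot commits to a single English output before the second system runs.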
