Can SMT and RBMT Improve Each Other’s Performance? An Experiment with English-Hindi Translation

Rule-based machine translation (RBMT) and statistical machine translation (SMT) are two well-known approaches to translation, each with its own benefits. The system architecture of SMT often complements that of RBMT, and vice versa. In this paper, we propose a method of serial coupling that builds a hybrid model exploiting the benefits of both architectures. The first part of the coupling is used to obtain good lexical selection and robustness, the second part is used to improve syntax, and the final part is designed to combine the other modules with the best phrase reordering. Our experiments on an English-Hindi product-domain dataset show the effectiveness of the proposed approach, with an improvement in BLEU score.
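
To illustrate what serial coupling means in general, the sketch below chains three stages so that each one consumes the previous stage's output. It is only a toy under stated assumptions, not the system described in the paper: the lexicon, the stage names (`lexical_stage`, `syntax_stage`, `final_stage`), and the single hand-written rule in each stage are all hypothetical placeholders standing in for a trained SMT component and real RBMT rules.

```python
from functools import reduce
from typing import Callable, List

Tokens = List[str]
Stage = Callable[[Tokens], Tokens]

# Hypothetical toy lexicon; a real system would use a trained phrase table.
TOY_LEXICON = {"phone": "फ़ोन", "good": "अच्छा", "is": "है"}
COPULA = TOY_LEXICON["is"]
ARTICLES = {"a", "an", "the"}


def lexical_stage(tokens: Tokens) -> Tokens:
    """Stand-in for lexical selection: translate known words and pass
    unknown words through unchanged instead of failing."""
    return [TOY_LEXICON.get(tok, tok) for tok in tokens]


def syntax_stage(tokens: Tokens) -> Tokens:
    """Stand-in for a syntax rule: move the copula to the end of the
    sentence, since Hindi is verb-final."""
    rest = [tok for tok in tokens if tok != COPULA]
    verbs = [tok for tok in tokens if tok == COPULA]
    return rest + verbs


def final_stage(tokens: Tokens) -> Tokens:
    """Stand-in for a final clean-up pass: drop leftover English
    articles, which have no Hindi counterpart."""
    return [tok for tok in tokens if tok not in ARTICLES]


def couple(stages: List[Stage]) -> Stage:
    """Serially couple stages: the output of one is the input of the next."""
    return lambda tokens: reduce(lambda acc, stage: stage(acc), stages, tokens)


def translate(sentence: str) -> str:
    pipeline = couple([lexical_stage, syntax_stage, final_stage])
    return " ".join(pipeline(sentence.lower().split()))


if __name__ == "__main__":
    print(translate("The phone is good"))
```

Because `couple` only composes functions, any stage can be swapped for a statistical or rule-based component without touching the others, which is the property a serial hybrid relies on.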
