Comparison of SMT and RBMT; The Requirement of Hybridization for Marathi-Hindi MT

In this paper we compare Statistical Machine Translation (SMT) and Rule-Based Machine Translation (RBMT) for translation from Marathi to Hindi. Rule-based systems, although robust, take a long time to build. Statistical machine translation systems, on the other hand, are easier to create, maintain, and improve. We describe the development of a basic Marathi-Hindi SMT system and evaluate its performance. Through a detailed error analysis, we point out the relative strengths and weaknesses of both systems. We show that, even with a small training corpus, a statistical machine translation system has many advantages over its rule-based counterpart for high-quality, domain-specific machine translation.
