论文信息 - Statistical machine translation of systems for Sinhala - Tamil

Statistical machine translation of systems for Sinhala - Tamil

One of the most promising and leading machine translation strategies would be Statistical Translation Approach. Being pertinent even to structurally dissimilar language pairs, it has confirmed its suitability for large text translation. Rising demand is present for automatic translation between Sinhala and Tamil for quite a lot of decades. Statistical approach is the best preference to resolve the unavailability of a machine translation tool for the languages concerned. Because of language similarity, statistical approach could thrive agreeably, exclusive of more concern on linguistic knowledge. A basic translation system has been modelled and implemented in this research, with the preparation of parallel corpora from parliament order papers. This paper demonstrates only the preliminary system runs of the research, devoid of various parameter refinements and actual design and evaluation strategies. Language Model, Translation Model and Decoder Configurations are done consistent with recent literature. To facilitate the improvement of output quality, MERT technique is integrated to tune the decoder. To stay away from sole dependence on BLEU, two other automatic metrics namely TER and NIST are utilised for the evaluation in different aspects. In addition, directions to future research are also recognized and specified for the refinements of this system.

[1] Franz Josef Och,et al. Minimum Error Rate Training in Statistical Machine Translation , 2003, ACL.

[2] Ralph Weischedel,et al. A STUDY OF TRANSLATION ERROR RATE WITH TARGETED HUMAN ANNOTATION , 2005 .

[3] Joel D. Martin,et al. Improving Translation Quality by Discarding Most of the Phrasetable , 2007, EMNLP.

[4] Philipp Koehn,et al. Moses: Open Source Toolkit for Statistical Machine Translation , 2007, ACL.

[5] Miles Osborne,et al. Statistical Machine Translation , 2010, Encyclopedia of Machine Learning and Data Mining.

[6] Daniel Marcu,et al. Statistical Phrase-Based Translation , 2003, NAACL.

[7] R. Weerasinghe. A Statistical Machine Translation Approach to Sinhala-Tamil Language Translation , 2003 .

[8] Philipp Koehn,et al. (Meta-) Evaluation of Machine Translation , 2007, WMT@ACL.

[9] Matthew G. Snover,et al. A Study of Translation Edit Rate with Targeted Human Annotation , 2006, AMTA.

[10] Deyi Xiong,et al. Efficient Beam Thresholding for Statistical Machine Translation , 2009, MTSUMMIT.

[11] George R. Doddington,et al. Automatic Evaluation of Machine Translation Quality Using N-gram Co-Occurrence Statistics , 2002 .

[12] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[13] Philip Koehn,et al. Statistical Machine Translation , 2010, EAMT.