Statistical machine translation of systems for Sinhala - Tamil

Statistical machine translation is one of the most promising and widely used machine translation strategies. Because it applies even to structurally dissimilar language pairs, it has proven suitable for translating large volumes of text. Demand for automatic translation between Sinhala and Tamil has been growing for several decades, yet no machine translation tool is available for this language pair. The statistical approach is a natural choice for filling that gap: because the two languages are similar, it can perform well without relying heavily on linguistic knowledge. In this research, a basic translation system was modelled and implemented, using parallel corpora prepared from parliament order papers. This paper reports only the preliminary system runs, without the various parameter refinements or the final design and evaluation strategies. The language model, translation model and decoder are configured in line with recent literature. To improve output quality, minimum error rate training (MERT) is used to tune the decoder. To avoid relying solely on BLEU, two other automatic metrics, TER and NIST, are used to evaluate the output from different aspects. Directions for future research are also identified for refining the system.
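As an illustration of the BLEU metric mentioned above, the following is a minimal sketch of an unsmoothed, single-reference, sentence-level BLEU score: the geometric mean of clipped n-gram precisions multiplied by a brevity penalty. It is not the evaluation script used in this research. The function names (`ngrams`, `modified_precision`, `sentence_bleu`) are hypothetical, and the tokens in the usage line are placeholders rather than corpus data.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against a single reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed BLEU for one candidate and one reference (both token lists)."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    # Without smoothing, a single zero precision makes the whole score zero.
    if min(precisions) == 0.0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    # Brevity penalty: candidates no longer than the reference are penalised.
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * math.exp(log_avg)


# Placeholder tokens only; a real evaluation would use tokenised Sinhala or Tamil text.
score = sentence_bleu(["w1", "w2", "w3", "w4", "w5"], ["w1", "w2", "w3", "w4", "w6"])
```

BLEU is normally reported at corpus level, by pooling n-gram counts over all sentences before taking the ratio. The sentence-level form is shown here only because it keeps the clipping and brevity-penalty steps easy to follow.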