VAASAANUBAADA: automatic machine translation of bilingual Bengali-Assamese news texts

This paper presents a project for translating Bengali-Assamese bilingual news texts using an example-based machine translation (EBMT) technique. Translation is performed at the sentence level, and the work also covers preprocessing and post-processing tasks. The work is distinctive in the language pair chosen for experimentation. We constructed and aligned the bilingual corpus manually by feeding real examples using pseudo code. Longer input sentences are fragmented at punctuation marks, which resulted in high-quality translation. When an exact match is not found at the sentence or fragment level, backtracking is used, leading to further fragmentation of the sentence. Since Bengali and Assamese both belong to the Magadha Prakrit group, the grammatical form of their sentences is very similar and has no lexical word groups. The results of testing are encouraging, with good-quality translations.
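The fragment-and-backtrack lookup described in the abstract can be illustrated with a minimal sketch. This is not the system's actual implementation. The example base `EXAMPLES`, the function names, the greedy longest-match strategy used for backtracking, and the `<...>` marker for unmatched words are all assumptions made for illustration. Placeholder tokens (`s1`, `t1`, ...) stand in for real Bengali and Assamese text.

```python
import re

# Punctuation at which sentences are fragmented; U+0964 is the danda
# used as a full stop in Bengali and Assamese.
PUNCT = ",;:.!?\u0964"
FRAGMENT_RE = re.compile(r"[^,;:.!?\u0964]+[,;:.!?\u0964]?")

# Hypothetical aligned example base: source text -> target text.
EXAMPLES = {
    "s1 s2 s3": "t1 t2 t3",
    "s4": "t4",
    "s5 s6": "t5 t6",
}


def fragment(sentence):
    """Split a sentence at punctuation, keeping each mark on its fragment."""
    pieces = (m.group().strip() for m in FRAGMENT_RE.finditer(sentence))
    return [p for p in pieces if p]


def lookup_words(words, examples):
    """Match the longest word run first, backing off to shorter runs."""
    out, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):
            key = " ".join(words[i:j])
            if key in examples:
                out.append(examples[key])
                i = j
                break
        else:
            # Assumption: unmatched words are passed through, marked.
            out.append(f"<{words[i]}>")
            i += 1
    return " ".join(out)


def translate(sentence, examples):
    """Try an exact sentence match, then fragments, then shorter runs."""
    sentence = sentence.strip()
    if sentence in examples:
        return examples[sentence]
    parts = []
    for frag in fragment(sentence):
        if frag[-1] in PUNCT:
            body, punct = frag[:-1].rstrip(), frag[-1]
        else:
            body, punct = frag, ""
        # Assumption: punctuation is copied unchanged to the output.
        parts.append(lookup_words(body.split(), examples) + punct)
    return " ".join(parts)


print(translate("s1 s2 s3, s4 s5 s6.", EXAMPLES))
```

In this sketch the first fragment matches an example exactly, while the second has no exact match and is broken into the shorter runs `s4` and `s5 s6`. Copying word order directly from source to target is only plausible here because, as the abstract notes, the two languages have very similar sentence structure.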