Attention based English-Bodo Neural Machine Translation System for Tourism Domain

Bodo is a relatively low-resource language. Apart from textbooks, novels, and some printed newspapers, very few resources appear to be available in the public domain. As technology becomes more affordable, the number of active Bodo internet users is growing, and these users need technology that can deliver information in their own language. Machine translation appears to be a promising solution for this purpose. In this work we build an English-Bodo Neural Machine Translation (NMT) system using a two-layer bidirectional Long Short-Term Memory (LSTM) network, which can capture long-term dependencies. Because very little work has been done on English-Bodo NMT, we build our own baseline model, which achieves a BLEU score of 11.8. We then progressively improve on the baseline by introducing several attention mechanisms. Using the approach presented by Bahdanau, we achieve a BLEU score of 16.71. Adding beam search with a beam width of 5 raises the BLEU score further to 17.9. We find that the model performs well despite the small amount of data available.
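For readers unfamiliar with the two techniques named above, the following is a minimal sketch of Bahdanau-style additive attention and of beam search. It is not the system described in this work: the function names, the weight matrices `W_s`, `W_h`, and `v`, and the `step_log_probs` callback are all hypothetical and chosen for illustration. The sketch uses plain Python lists in place of a deep learning framework and applies no length normalization to beam scores.

```python
import math


def additive_attention(decoder_state, encoder_states, W_s, W_h, v):
    """Additive (Bahdanau-style) attention over encoder states.

    score_i = v . tanh(W_s @ decoder_state + W_h @ encoder_states[i])
    Returns (attention weights, context vector).
    """
    def matvec(M, x):
        return [sum(m * xi for m, xi in zip(row, x)) for row in M]

    proj_s = matvec(W_s, decoder_state)
    scores = []
    for h in encoder_states:
        proj_h = matvec(W_h, h)
        hidden = [math.tanh(a + b) for a, b in zip(proj_s, proj_h)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    # Numerically stable softmax over source positions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, encoder_states))
        for d in range(dim)
    ]
    return weights, context


def beam_search(step_log_probs, beam_width=5, max_len=10, eos="</s>"):
    """Keep the beam_width best partial hypotheses at each step.

    step_log_probs(prefix) is a hypothetical callback that returns
    {token: log_prob} for the next token given the prefix so far.
    Returns the best (token tuple, total log-probability) pair.
    """
    beams = [((), 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for token, log_prob in step_log_probs(prefix).items():
                candidates.append((prefix + (token,), score + log_prob))
        candidates.sort(key=lambda c: c[1], reverse=True)

        beams = []
        for prefix, score in candidates[:beam_width]:
            if prefix[-1] == eos:
                finished.append((prefix, score))
            else:
                beams.append((prefix, score))
        if not beams:
            break

    finished.extend(beams)  # hypotheses cut off at max_len
    return max(finished, key=lambda c: c[1])
```

With `beam_width=1` this reduces to greedy decoding. A wider beam can recover a translation whose first token is not the locally most probable one, which is one plausible reason beam search improves BLEU over greedy decoding.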