Parallel Corpora for bi-lingual English-Ethiopian Languages Statistical Machine Translation

In this paper, we describe an attempt to develop parallel corpora for English and Ethiopian languages, such as Amharic, Tigrigna, Afan-Oromo, Wolaytta and Ge’ez. The corpora are used to conduct bi-directional statistical machine translation (SMT) experiments. The BLEU scores of the bi-directional SMT systems are promising. The morphological richness of the Ethiopian languages has a great impact on SMT performance, especially when the target is an Ethiopian language. We are currently working towards an optimal alignment for bi-directional English-Ethiopian language SMT.
