Paper Information - Amharic-English Speech Translation in Tourism Domain

Amharic-English Speech Translation in Tourism Domain

This paper describes Amharic-to-English speech translation, in particular Automatic Speech Recognition (ASR) with a post-editing feature and Amharic-English Statistical Machine Translation (SMT). The ASR experiment is conducted using a morpheme-based language model (LM) and a phoneme-based acoustic model (AM). Likewise, SMT is conducted using words and morphemes as translation units. Morpheme-based translation yields a BLEU score of 6.29 at 76.4% recognition accuracy, while word-based translation yields a BLEU score of 12.83 at 77.4% word recognition accuracy. Further, after post-editing the Amharic ASR output using a corpus-based n-gram, word recognition accuracy increased by 1.42%. Since the post-editing approach reduces error propagation, the word-based translation accuracy improved by 0.25 BLEU (1.95%). We are now working towards further reducing propagated errors through different algorithms at each component of the speech translation cascade.
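The 1.95% figure appears to be the 0.25-point BLEU gain expressed relative to the 12.83 word-based baseline. A minimal sketch of that arithmetic, assuming the gain is measured against that baseline (the function and variable names are illustrative and do not come from the paper):

```python
def relative_gain(baseline: float, absolute_gain: float) -> float:
    """Return the absolute gain as a percentage of the baseline score."""
    return 100.0 * absolute_gain / baseline


# Values taken from the abstract; the interpretation as a relative gain
# over the word-based baseline is an assumption.
word_bleu_baseline = 12.83  # word-based BLEU before post-editing
bleu_gain = 0.25            # absolute BLEU improvement after post-editing

relative_bleu_gain = round(relative_gain(word_bleu_baseline, bleu_gain), 2)
print(relative_bleu_gain)
```

Under this reading the rounded result agrees with the percentage reported in the abstract. The abstract does not say whether the 1.42% recognition gain is absolute or relative, so it is left out of the sketch.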

Laurent Besacier | Million Meshesha | Michael Melese Woldeyohannis
