First automatic Fongbe continuous speech recognition system: Development of acoustic models and language models

This paper reports our efforts toward an automatic speech recognition (ASR) system for Fongbe, a new under-resourced language. The aim of this work is to build acoustic models and language models for continuous speech decoding in Fongbe. Fongbe, an African language spoken mainly in Benin, Togo, and Nigeria, has no existing language resources for building an ASR system. As part of this work, we first collected Fongbe text and speech corpora, which are described in the following sections. Acoustic modeling was carried out at the grapheme level, and two language models were built so that their performance could be compared. We also simplified vowels by removing tone diacritics in order to investigate the impact of tone marking on the language models.
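The vowel simplification mentioned above can be illustrated with a minimal sketch. It assumes that tones are written as Unicode combining marks (acute, grave, circumflex, caron, macron) and that base letters such as ɛ, ɔ, and ɖ must be preserved. The function name `strip_tones` and the exact set of marks are hypothetical choices made for illustration, not the authors' actual preprocessing.

```python
import unicodedata

# Assumed set of tone diacritics (combining grave, acute, circumflex,
# macron, caron). This set is an illustrative assumption, not taken
# from the paper.
TONE_MARKS = {"\u0300", "\u0301", "\u0302", "\u0304", "\u030C"}


def strip_tones(text: str) -> str:
    """Remove the assumed tone diacritics while keeping base letters."""
    # NFD splits precomposed characters into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    # NFC recomposes whatever remains into its canonical composed form.
    return unicodedata.normalize("NFC", kept)


def vocabulary(tokens):
    """Return the set of distinct word forms in a token list."""
    return set(tokens)
```

Applying `strip_tones` to every token before building the vocabulary merges word forms that differ only in tone marking, so the vocabulary of the tone-stripped text can only be the same size or smaller than that of the original text. This is one way the effect of tone diacritics on a language model could be examined.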
