Toward an Automatic Fongbe Speech Recognition System: Hierarchical Mixtures of Algorithms for Phoneme Recognition

In this paper, we demonstrate the efficacy of an automatic continuous speech recognition system that combines fuzzy and neural approaches with an acoustic analysis of the sounds of an under-resourced language. The proposed system integrates extraction, segmentation, and phoneme recognition modules, and its core is phoneme detection in continuous speech. This work offers a complete recipe of algorithms that perform the following tasks hierarchically: speech segmentation, phoneme classification, and phoneme recognition. The segmentation task outputs phoneme segments, which are subsequently classified according to their nature (consonant or vowel, voiced or unvoiced, etc.). Segmentation and classification are based exclusively on a fuzzy approach, while the phoneme recognition task exploits acoustic features such as formants for vowels and pitch and intensity for consonants. Experiments were performed on the Fongbe language (an African tonal language spoken mainly in Benin, Togo, and Nigeria), and phoneme error rates are reported.
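The hierarchical flow described above (segment, then classify, then recognize with a class-specific recognizer) can be sketched as follows. This is only an illustrative skeleton under stated assumptions: the `Segment` type, the function names, the feature keys, and the toy threshold rules are hypothetical stand-ins and are not the fuzzy or neural models used in the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Segment:
    """One phoneme-sized span produced by a segmentation step (hypothetical type)."""
    start: float                 # start time in seconds
    end: float                   # end time in seconds
    features: Dict[str, float]   # acoustic measurements for this span


def recognize_phonemes(
    segments: List[Segment],
    classify: Callable[[Segment], str],
    recognizers: Dict[str, Callable[[Segment], str]],
) -> List[str]:
    """Classify each segment first, then dispatch to the recognizer for its class."""
    labels = []
    for seg in segments:
        phoneme_class = classify(seg)               # e.g. "vowel" or "consonant"
        labels.append(recognizers[phoneme_class](seg))
    return labels


# Toy stand-ins with arbitrary rules, for illustration only.
def toy_classify(seg: Segment) -> str:
    # Assumption: vowel segments carry a first-formant measurement "f1".
    return "vowel" if "f1" in seg.features else "consonant"


def toy_vowel_recognizer(seg: Segment) -> str:
    # Open vowels such as /a/ tend to have a higher F1 than close vowels such as /i/.
    return "a" if seg.features["f1"] > 600.0 else "i"


def toy_consonant_recognizer(seg: Segment) -> str:
    # Assumption: a pitch of 0 marks an unvoiced segment.
    return "b" if seg.features["pitch"] > 0.0 else "k"


segments = [
    Segment(0.00, 0.10, {"f1": 750.0, "f2": 1200.0}),
    Segment(0.10, 0.15, {"pitch": 0.0, "intensity": 55.0}),
]
result = recognize_phonemes(
    segments,
    toy_classify,
    {"vowel": toy_vowel_recognizer, "consonant": toy_consonant_recognizer},
)
print(result)
```

Keeping the class-specific recognizers in a dictionary mirrors the hierarchy in the text: formant-based rules are only consulted for segments labeled as vowels, and pitch and intensity rules only for consonants.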
