This research constructs a phonetic feature (PF) table for all the phonemes of the Bangla (widely known as Bengali) language. The study is divided into two parts: the first constructs the PF table, and the second addresses Bangla automatic speech recognition (ASR) using PFs. Fifty-three Bangla phonemes, including both vowels and consonants, are considered; among them, the phones k (/s/) and m (/s/), and Y (/n/) and b (/n/), have approximately the same spectrum and hence share the same PFs. Twenty-two PFs (Silence, Short Silence, Stop, ...) are required to represent all the Bangla phonemes in the PF table. The second part comprises three stages: i) the first stage extracts acoustic features, namely mel-frequency cepstral coefficients (MFCCs); ii) the second stage extracts PFs using a multilayer neural network (MLN); and iii) the final stage uses a triphone-based hidden Markov model (HMM) that takes the log values of the twenty-two-dimensional PFs as input and generates the output text strings. In experiments on Bangla Newspaper Article Sentences, the PF-based ASR system provides a higher word correct rate, word accuracy, and sentence correct rate than the standard MFCC-based method.
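The hand-off from the second stage to the third (MLN outputs converted to log values before entering the HMM) can be illustrated with a minimal sketch. It assumes the MLN emits one value in [0, 1] per PF for each frame; the function names and the `floor` guard against log(0) are hypothetical and are not taken from the paper.

```python
import math

NUM_PFS = 22  # PF dimensionality stated in the abstract


def log_pf_frame(pf_frame, floor=1e-6):
    """Convert one frame of MLN outputs to log values.

    pf_frame: sequence of NUM_PFS values, assumed to lie in [0, 1].
    floor: assumed lower bound that avoids log(0); not specified in the source.
    """
    if len(pf_frame) != NUM_PFS:
        raise ValueError(f"expected {NUM_PFS} PF values, got {len(pf_frame)}")
    return [math.log(max(p, floor)) for p in pf_frame]


def log_pf_sequence(frames, floor=1e-6):
    """Apply log_pf_frame to every frame of an utterance."""
    return [log_pf_frame(frame, floor) for frame in frames]
```

Each resulting frame is a twenty-two-dimensional vector of log values, which is the form the abstract describes as the HMM input. The floor value is a tunable choice in this sketch, not a reported setting.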