On recognition of spoken Bengali numerals

This paper presents a method for recognizing isolated spoken Bengali numerals from noisy audio samples. Mel-frequency cepstral coefficients (MFCCs) are used to extract features from the audio. Vector quantization is applied to reduce the dimension of the feature vectors and to generate a vector codebook for the numerals. Classification is based on dynamic time warping (DTW) and a minimum-distance classifier that uses the Euclidean distance measure. Accuracy is evaluated in both speaker-dependent and speaker-independent settings. The results show the limitations of the standard MFCC-based speech processing approach for speaker-independent spoken digit recognition in the presence of noise.
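The classification step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the unconstrained DTW step pattern, and the toy one-dimensional "feature vectors" are all assumptions made for illustration. Real inputs would be sequences of MFCC (or codebook-quantized) vectors, one per audio frame.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Uses the basic step pattern (match, insertion, deletion) with no
    slope or window constraint; this is an illustrative choice.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # deletion
                                 cost[i][j - 1],      # insertion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

def classify(utterance, templates):
    """Minimum-distance classifier: return the label of the nearest template.

    `templates` is a hypothetical dict mapping a numeral label to one
    reference feature sequence.
    """
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))

# Toy example with 1-D "features"; labels are placeholders.
templates = {"one": [[0.0], [1.0], [2.0]], "two": [[5.0], [5.0], [5.0]]}
utterance = [[0.0], [0.0], [1.0], [2.0]]  # time-stretched version of "one"
predicted = classify(utterance, templates)
```

Because DTW aligns frames non-linearly in time, the stretched utterance still matches its template despite the difference in length. That robustness to speaking rate is why DTW is commonly used for isolated word recognition.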
