Comparative Performance Analysis for Speech Digit Recognition based on MFCC and Vector Quantization

Abstract The main goal of this work is to experimentally verify the importance of spoken digit signals for person authentication in control applications. The motivation comes from earlier work demonstrating the feasibility of using spoken digit utterances for security and control applications. The paper also presents a comparative analysis of cepstral features and mel-frequency cepstral coefficient (MFCC) features, using vector quantization (VQ) as the feature-matching technique. Utterances of all digits from zero to nine were collected from 15 subjects in three different sessions. Cepstral and MFCC feature extraction techniques were applied to the collected data. Vector quantization was then used for feature matching on both the cepstral and the MFCC features, and performance was recorded for data from two different sessions. Comparing cepstral plus VQ against MFCC plus VQ, we conclude that MFCC features give better performance than cepstral features for spoken digit utterance data.
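
The pipeline summarized in the abstract (frame the utterance, extract cepstral or MFCC features, train one VQ codebook per digit, then pick the digit whose codebook gives the lowest average distortion) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the sample rate, frame length, hop size, number of coefficients, number of mel filters, codebook size, and the use of plain k-means for codebook training are all assumptions, since the abstract does not state them.

```python
import numpy as np

# All numeric settings are illustrative assumptions, not values from the paper.
SAMPLE_RATE = 8000
FRAME_LEN = 256
HOP = 128
N_COEFFS = 13
N_FILTERS = 20

def frame_signal(signal):
    """Split a 1-D signal into overlapping Hamming-windowed frames."""
    n_frames = 1 + (len(signal) - FRAME_LEN) // HOP
    frames = np.stack([signal[i * HOP:i * HOP + FRAME_LEN] for i in range(n_frames)])
    return frames * np.hamming(FRAME_LEN)

def cepstral_features(frames):
    """Real cepstrum: inverse FFT of the log magnitude spectrum of each frame."""
    log_mag = np.log(np.abs(np.fft.rfft(frames, axis=1)) + 1e-10)
    return np.fft.irfft(log_mag, axis=1)[:, :N_COEFFS]

def mel_filterbank():
    """Triangular filters with edges spaced evenly on the mel scale."""
    to_mel = lambda hz: 2595.0 * np.log10(1.0 + hz / 700.0)
    to_hz = lambda mel: 700.0 * (10.0 ** (mel / 2595.0) - 1.0)
    edges_hz = to_hz(np.linspace(0.0, to_mel(SAMPLE_RATE / 2), N_FILTERS + 2))
    bins = np.floor((FRAME_LEN + 1) * edges_hz / SAMPLE_RATE).astype(int)
    bank = np.zeros((N_FILTERS, FRAME_LEN // 2 + 1))
    for m in range(N_FILTERS):
        left, center, right = bins[m], bins[m + 1], bins[m + 2]
        for k in range(left, center):
            bank[m, k] = (k - left) / (center - left)
        for k in range(center, right):
            bank[m, k] = (right - k) / (right - center)
    return bank

def mfcc_features(frames):
    """MFCC: DCT-II of the log mel-filterbank energies of each frame."""
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    log_energy = np.log(power @ mel_filterbank().T + 1e-10)
    n = np.arange(N_FILTERS)
    dct_basis = np.cos(np.pi * np.outer(np.arange(N_COEFFS), n + 0.5) / N_FILTERS)
    return log_energy @ dct_basis.T

def nearest_distances(features, codebook):
    """Euclidean distance from every feature vector to every codeword."""
    return np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)

def train_codebook(features, n_codewords=8, n_iter=20, seed=0):
    """Train a VQ codebook with plain k-means (assumed; LBG is a common alternative)."""
    rng = np.random.default_rng(seed)
    picks = rng.choice(len(features), n_codewords, replace=False)
    codebook = features[picks].copy()
    for _ in range(n_iter):
        nearest = nearest_distances(features, codebook).argmin(axis=1)
        for k in range(n_codewords):
            members = features[nearest == k]
            if len(members):  # leave a codeword unchanged if its cell is empty
                codebook[k] = members.mean(axis=0)
    return codebook

def distortion(features, codebook):
    """Average distance from each feature vector to its nearest codeword."""
    return nearest_distances(features, codebook).min(axis=1).mean()

def classify(features, codebooks):
    """Return the label whose codebook yields the lowest average distortion."""
    return min(codebooks, key=lambda label: distortion(features, codebooks[label]))
```

Under these assumptions, a comparison like the one in the abstract would train one codebook per digit on `cepstral_features(frame_signal(x))` and another on `mfcc_features(frame_signal(x))` from one session, then call `classify` on utterances from a different session and compare the resulting recognition rates.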
