Paper information - Voice Recognition Algorithms using Mel Frequency Cepstral Coefficient (MFCC) and Dynamic Time Warping (DTW) Techniques


Abstract: Digital processing of speech signals and voice recognition algorithms are very important for fast and accurate automatic voice recognition technology. The voice is a signal of infinite information. Direct analysis and synthesis of the complex voice signal is difficult because of the amount of information contained in the signal. Therefore, digital signal processes such as Feature Extraction and Feature Matching are introduced to represent the voice signal. Several methods, such as Linear Predictive Coding (LPC), Hidden Markov Model (HMM), and Artificial Neural Network (ANN), are evaluated with a view to identifying a straightforward and effective method for voice signals. The extraction and matching process is implemented right after the pre-processing, or filtering, of the signal is performed. Mel Frequency Cepstral Coefficients (MFCCs), a non-parametric method for modelling the human auditory perception system, are utilized as the extraction technique. The non-linear sequence alignment known as Dynamic Time Warping (DTW), introduced by Sakoe and Chiba, is used as the feature matching technique. Since the voice signal tends to have different temporal rates, the alignment is important for better performance. This paper presents the viability of MFCC to extract features and DTW to compare the test patterns.
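To illustrate the feature-matching step described in the abstract, here is a minimal sketch of the textbook DTW recurrence. It is not the paper's implementation. The function name `dtw_distance`, the Euclidean local distance, and the unconstrained step pattern (no warping-window constraint) are assumptions made for this example. Each sequence element stands for one frame of features, for instance one MFCC vector.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of equal-dimension frames.

    Illustrative sketch only: Euclidean local distance, steps limited to
    (i-1, j), (i, j-1) and (i-1, j-1), and no warping-window constraint.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # frame-to-frame distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # frame of a aligned to an already-used frame of b
                cost[i][j - 1],      # frame of b aligned to an already-used frame of a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# Hypothetical 1-D "feature" sequences: the same contour spoken at two rates.
template = [(0.0,), (1.0,), (2.0,), (1.0,), (0.0,)]
slow = [(0.0,), (0.0,), (1.0,), (1.0,), (2.0,), (2.0,), (1.0,), (1.0,), (0.0,), (0.0,)]
flat = [(2.0,)] * 5

print(dtw_distance(template, slow))  # same shape at half speed
print(dtw_distance(template, flat))  # different shape
```

Because the alignment can map one frame to several, the half-speed copy matches the template at zero cost, while the flat sequence does not. This is the property the abstract relies on when it says alignment matters for signals with different temporal rates.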

[1] S. Chiba, et al. Dynamic programming algorithm optimization for spoken word recognition, 1978.

[2] R. Manmatha, et al. Word image matching using dynamic time warping, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003.

[3] Clarence Goh, et al. Robust Computer Voice Recognition Using Improved MFCC Algorithm, 2009 International Conference on New Trends in Information and Service Science, 2009.

[4] Stan Salvador, et al. FastDTW: Toward Accurate Dynamic Time Warping in Linear Time and Space, 2004.

[5] G. Carlson. Signal and Linear System Analysis, 1992.

[6] Jérôme Boudy, et al. Experiments with a nonlinear spectral subtractor (NSS), Hidden Markov models and the projection, for robust speech recognition in cars, Speech Commun., 1991.

[7] Chunsheng Fang. From Dynamic Time Warping (DTW) to Hidden Markov Model (HMM), Final project report for ECE742 Stochastic Decision, 2009.

[8] A.M. Ahmad, et al. Malay language text-independent speaker verification using NN-MLP classifier with MFCC, 2008 International Conference on Electronic Design, 2008.