Acoustic Feature Comparison of MFCC and CZT-Based Cepstrum for Speech Recognition

Cepstral features are important parameters in Automatic Speech Recognition (ASR) because they reflect properties of the human auditory system (HAS). Mel-Frequency Cepstral Coefficients (MFCC) are the most widely used features in speech recognition. This paper discusses the Chirp Z-Transform (CZT) algorithm and proposes CZT-based cepstral coefficients together with the corresponding feature extraction method. The experiments were performed in MATLAB. Simulation results show the correctness and effectiveness of both MFCC and the CZT-based cepstrum for Mandarin digit recognition, and the recognition rate of the MFCC algorithm is compared with that of the CZT-based features. Including CZT-based cepstral features in the parameter space may improve the recognition rate.
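
The abstract does not spell out the extraction steps, so the following is only a minimal sketch of what a CZT-based cepstrum could look like, written in Python rather than the MATLAB used for the experiments. It evaluates the z-transform of one frame at `m` points on an arc of the unit circle between two frequencies, takes the log magnitude, and applies a DCT-II. The function names, the band limits `f_lo` and `f_hi`, the DCT step, and the defaults `m=64` and `n_ceps=12` are all assumptions for illustration, not the paper's method.

```python
import cmath
import math


def czt(x, m, w, a):
    """Direct O(N*M) Chirp Z-Transform: X[k] = sum_n x[n] * a**(-n) * w**(n*k)."""
    return [
        sum(xn * a ** (-n) * w ** (n * k) for n, xn in enumerate(x))
        for k in range(m)
    ]


def czt_cepstrum(frame, fs, f_lo, f_hi, m=64, n_ceps=12):
    """Hypothetical CZT-based cepstrum of one frame (illustrative only)."""
    # Start on the unit circle at f_lo; step so that m points span [f_lo, f_hi).
    a = cmath.exp(2j * math.pi * f_lo / fs)
    w = cmath.exp(-2j * math.pi * (f_hi - f_lo) / (m * fs))
    spectrum = czt(frame, m, w, a)
    # Log magnitude spectrum; the small offset avoids log(0).
    log_mag = [math.log(abs(s) + 1e-12) for s in spectrum]
    # DCT-II of the log spectrum gives the cepstral coefficients.
    return [
        sum(lm * math.cos(math.pi * q * (k + 0.5) / m)
            for k, lm in enumerate(log_mag))
        for q in range(n_ceps)
    ]


# Usage: a 25 ms frame of a 440 Hz tone sampled at 8 kHz, analysed over 100-3400 Hz.
fs = 8000
frame = [math.sin(2 * math.pi * 440 * n / fs) for n in range(200)]
coeffs = czt_cepstrum(frame, fs, 100.0, 3400.0)
```

With `a = 1` and `w = exp(-2j*pi/N)` the `czt` function reduces to the ordinary DFT, which is a convenient sanity check. The direct summation is quadratic in cost; a practical implementation would typically use an FFT-based formulation of the CZT such as Bluestein's algorithm.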
