Auditory ERB like admissible wavelet packet features for TIMIT phoneme recognition

In recent years, the wavelet transform has been found to be an effective tool for time–frequency analysis. It has been used for feature extraction in speech recognition applications and has proved effective for unvoiced phoneme classification. In this paper, a new filter structure using admissible wavelet packets is analyzed for English phoneme recognition. These filters have the benefit of frequency band spacing similar to the auditory Equivalent Rectangular Bandwidth (ERB) scale. The center frequencies of the ERB scale are equally distributed along the frequency response of the human cochlea. A new set of features is derived using the multi-resolution capabilities of the wavelet packet transform and is found to perform better than conventional features on unvoiced phonemes. Several noise types from the NOISEX-92 database were used to prepare an artificial noisy database for testing the robustness of the wavelet-based features.
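
The ERB-like band spacing described above can be illustrated with a minimal sketch. It assumes the widely used Glasberg and Moore ERB formulas, an upper limit of 8 kHz (the Nyquist frequency for TIMIT's 16 kHz sampling rate), and a hypothetical count of 24 bands. The function names are illustrative only, and the paper's actual admissible wavelet packet tree is not reproduced here.

```python
import math

def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale (Glasberg and Moore form)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_rate_to_hz(erb):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb / 21.4) - 1.0) * 1000.0 / 4.37

def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz at center frequency f_hz."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def erb_center_frequencies(f_low, f_high, n_bands):
    """Center frequencies equally spaced on the ERB-rate scale."""
    lo = hz_to_erb_rate(f_low)
    hi = hz_to_erb_rate(f_high)
    step = (hi - lo) / (n_bands + 1)
    return [erb_rate_to_hz(lo + step * (i + 1)) for i in range(n_bands)]

# Assumed values for illustration: 0 to 8000 Hz, 24 bands.
centers = erb_center_frequencies(0.0, 8000.0, 24)

if __name__ == "__main__":
    for fc in centers:
        print(f"center = {fc:8.1f} Hz, ERB = {erb_bandwidth(fc):7.1f} Hz")
```

Because the centers are uniform on the ERB-rate scale, they are densely packed at low frequencies and spread out at high frequencies. An admissible wavelet packet tree approximates this kind of spacing by splitting low-frequency nodes more deeply than high-frequency ones.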
