Role of Spectral Peaks in Autocorrelation Domain for Robust Speech Recognition

This paper presents a new front-end for robust speech recognition. The front-end works on the spectral features of filtered speech signals in the autocorrelation domain, which is well known for its pole-preserving and noise-separation properties and is therefore a suitable candidate for robust feature extraction. The proposed method introduces a novel representation of speech for cases where the speech signal is corrupted by additive noise. Speech features are computed in two steps: an initial filtering stage reduces the effect of the additive noise, and the peaks of the autocorrelation spectrum are then extracted. Robust features based on these peaks are derived under the assumption that the corrupting noise is stationary. A speaker-independent isolated-word recognition task is used to demonstrate the effectiveness of these features. White noise and colored noises (factory, babble, and F16) are tested. Experimental results show significant improvement over the results obtained with traditional front-end methods. Applying cepstral mean normalization (CMN) to the extracted features gives a further improvement.
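
The processing chain described above (filtering in the autocorrelation domain, peak extraction from the autocorrelation spectrum, then CMN) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the filter, the peak-picking rule, or any parameter values. The per-lag mean subtraction used as the filtering stage, the half-Hann taper, and every function name and default below are assumptions chosen only for illustration.

```python
import numpy as np


def frame_signal(x, frame_len=256, hop=128):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def autocorrelation_frames(frames, max_lag=128):
    """Biased one-sided autocorrelation (lags 0..max_lag) of each frame."""
    n = frames.shape[1]
    acf = np.empty((frames.shape[0], max_lag + 1))
    for i, frame in enumerate(frames):
        full = np.correlate(frame, frame, mode="full")  # lags -(n-1)..(n-1)
        acf[i] = full[n - 1 : n + max_lag] / n          # keep lags 0..max_lag
    return acf


def remove_stationary_component(acf):
    """Assumed filtering stage: subtract the per-lag mean over frames.

    The autocorrelation of stationary noise is roughly constant from frame
    to frame, so removing the per-lag mean suppresses it.
    """
    return acf - acf.mean(axis=0, keepdims=True)


def spectrum_peaks(acf_frame, fs, n_fft=512, n_peaks=3):
    """Return frequencies (Hz) and magnitudes of the strongest spectral peaks."""
    n_lags = len(acf_frame)
    taper = np.hanning(2 * n_lags - 1)[n_lags - 1 :]    # 1 at lag 0, 0 at last lag
    spec = np.abs(np.fft.rfft(acf_frame * taper, n_fft))
    # Interior local maxima of the magnitude spectrum.
    is_peak = (spec[1:-1] > spec[:-2]) & (spec[1:-1] >= spec[2:])
    bins = np.flatnonzero(is_peak) + 1
    strongest = np.sort(bins[np.argsort(spec[bins])[::-1][:n_peaks]])
    return strongest * fs / n_fft, spec[strongest]


def cepstral_mean_normalization(features):
    """Subtract the per-coefficient mean over time (rows are frames)."""
    return features - features.mean(axis=0, keepdims=True)
```

A typical call sequence under these assumptions would be `frame_signal`, then `autocorrelation_frames`, then `remove_stationary_component`, then `spectrum_peaks` on each row. `cepstral_mean_normalization` would be applied last, to whatever cepstral feature matrix is derived from the peaks. The mean-subtraction filter needs speech that varies across frames: a perfectly stationary input is removed entirely, which is the intended behavior for the noise component.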
