Robust feature extraction based on spectral peaks of group delay and autocorrelation function and phase domain analysis

This paper presents a new robust feature set for noisy speech recognition, extracted in the phase domain and augmented with spectral peaks obtained from the group delay and autocorrelation functions. The group delay domain is well suited to formant tracking, and the autocorrelation domain is well known for its pole-preserving and noise-separation properties. We append spectral peaks obtained in either the group delay or the autocorrelation domain to feature vectors originally extracted in the phase domain to create the new feature set. We tested the features on the Aurora 2 noisy isolated-word task and found that they improved on other group delay-based and autocorrelation-based methods that use magnitude instead of phase for feature extraction.
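
As a rough illustration of the two domains the abstract mentions, the sketch below computes a textbook group delay function and the magnitude spectrum of a one-sided autocorrelation sequence for a single frame, then picks the largest local maxima of a curve. It is not the paper's pipeline: the function names, the FFT size, the biased autocorrelation estimate, and the simple peak picker are all assumptions made for illustration, and the phase-domain base features are not reproduced here.

```python
import numpy as np

def group_delay(frame, n_fft=512, eps=1e-8):
    """Group delay tau(w) = -d(phase)/dw, computed without phase unwrapping.

    Uses the standard identity tau = (Xr*Yr + Xi*Yi) / |X|^2, where
    Y is the spectrum of n * x[n]. eps is an assumed guard against
    division by zero at spectral nulls.
    """
    n = np.arange(len(frame))
    X = np.fft.rfft(frame, n_fft)
    Y = np.fft.rfft(n * frame, n_fft)
    return (X.real * Y.real + X.imag * Y.imag) / (np.abs(X) ** 2 + eps)

def autocorr_spectrum(frame, n_fft=512):
    """Magnitude spectrum of the one-sided, biased autocorrelation sequence."""
    full = np.correlate(frame, frame, mode="full")
    r = full[len(frame) - 1:] / len(frame)  # keep lags 0 .. N-1
    return np.abs(np.fft.rfft(r, n_fft))

def top_peaks(curve, k=3):
    """Bin indices of the k largest local maxima, in ascending bin order."""
    inner = curve[1:-1]
    is_peak = (inner > curve[:-2]) & (inner >= curve[2:])
    idx = np.flatnonzero(is_peak) + 1
    best = idx[np.argsort(curve[idx])[::-1][:k]]
    return np.sort(best)

# Hypothetical usage: peak bins that could be appended to a base feature vector.
t = np.arange(256)
frame = np.cos(2 * np.pi * 32 / 512 * t)
peak_bins = top_peaks(autocorr_spectrum(frame), k=3)
```

Avoiding explicit phase unwrapping is the usual reason for computing group delay through the spectrum of `n * x[n]`; how the resulting peak locations are encoded and appended to the feature vector is a design choice this sketch leaves open.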
