Speech processing using joint features derived from the modified group delay function

The paper discusses the significance of joint cepstral features derived from the modified group delay function and MFCC in speech processing. We start by defining the modified group delay feature (MODGDF), a cepstral feature computed from the modified group delay function, which is in turn derived from the Fourier transform phase. Robustness issues, including the similarity of the MODGDF to RASTA and cepstral mean subtraction, are discussed. We illustrate how efficiently formants can be reconstructed for noisy cellular speech using joint features obtained by early fusion. The joint features are used for four speech processing tasks: phoneme, syllable, speaker, and language recognition. Based on the results of analysis and performance evaluation, the significance of joint features derived from the MODGDF and MFCC is discussed.
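As a rough illustration of the pipeline summarized above, the following minimal sketch computes a MODGDF-style vector for one speech frame and joins it with an MFCC vector by early fusion (plain concatenation). It is an assumption-laden sketch, not the authors' implementation: the function names, the FFT size, the exponents `alpha` and `gamma`, the cepstral smoothing length `n_lifter`, and the use of a DCT to obtain cepstral coefficients are all illustrative choices that the abstract does not specify. The MFCC vector is assumed to come from a separate front end.

```python
import numpy as np

def cepstral_smooth(mag, n_lifter=8):
    """Smooth a magnitude spectrum by keeping only low-quefrency cepstral terms."""
    log_mag = np.log(mag + 1e-10)             # small offset avoids log(0)
    cep = np.fft.irfft(log_mag)               # real cepstrum of the frame
    cep[n_lifter + 1:-n_lifter] = 0.0         # zero high quefrencies, keep symmetry
    return np.exp(np.fft.rfft(cep).real)      # strictly positive smoothed spectrum

def modified_group_delay(frame, n_fft=512, alpha=0.4, gamma=0.9, n_lifter=8):
    """Modified group delay of one frame (alpha, gamma are assumed values)."""
    n = np.arange(len(frame))
    X = np.fft.rfft(frame, n_fft)             # spectrum of x(n)
    Y = np.fft.rfft(n * frame, n_fft)         # spectrum of n * x(n)
    S = cepstral_smooth(np.abs(X), n_lifter)  # smoothed magnitude replaces |X|
    tau = (X.real * Y.real + X.imag * Y.imag) / (S ** (2.0 * gamma))
    return np.sign(tau) * np.abs(tau) ** alpha  # compress the dynamic range

def dct_ii(x, n_coeffs):
    """First n_coeffs coefficients of an unnormalized DCT-II."""
    N = len(x)
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(N)[None, :]
    return np.cos(np.pi * k * (2 * m + 1) / (2 * N)) @ x

def modgdf(frame, n_coeffs=13, **kwargs):
    """Cepstral features derived from the modified group delay function."""
    return dct_ii(modified_group_delay(frame, **kwargs), n_coeffs)

def early_fusion(modgdf_vec, mfcc_vec):
    """Early fusion: concatenate the two feature streams before modeling."""
    return np.concatenate([modgdf_vec, mfcc_vec])
```

In this sketch, early fusion means the classifier sees a single joint vector per frame, as opposed to combining the scores of separately trained MODGDF and MFCC models.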
