A study of two dimensional linear discriminants for ASR

We study the information in the joint time-frequency domain using a 1515-dimensional block of the spectrogram (15 spectral energies over a temporal span of 1 s) as features. In this feature space, we first derive 20 joint linear discriminants (JLDs) using linear discriminant analysis (LDA). Using principal component analysis (PCA), we conclude that the information in this block of the spectrogram can be analyzed independently across the time and frequency domains. Under this assumption, we propose a sequential design of two-dimensional discriminants (CLDs), i.e., spectral discriminants followed by temporal discriminants. We show that these CLDs are similar to the first few JLDs and that the discriminant features derived from the CLDs outperform those obtained from the JLDs in the continuous-digit recognition task.
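The sequential design described above (spectral discriminants first, then temporal discriminants on each projected trajectory) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the ridge regularization, the choice to label every frame with its block's class, and the block layout of 15 bands by 101 frames (which gives 15 × 101 = 1515 dimensions if the 1 s span is sampled at a 10 ms step) are all hypothetical.

```python
import numpy as np

def lda_directions(X, y, n_components, ridge=1e-6):
    """Leading LDA directions (as columns) for samples X of shape (n, d)."""
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Assumption: a small ridge keeps Sw invertible for high-dimensional input.
    Sw += ridge * np.trace(Sw) / d * np.eye(d)
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(evals.real)[::-1][:n_components]
    return evecs[:, order].real

def sequential_discriminants(blocks, y, n_spectral, n_temporal):
    """blocks: (n_samples, n_bands, n_frames). Returns (W_s, list of W_t)."""
    n, n_bands, n_frames = blocks.shape
    by_frame = blocks.transpose(0, 2, 1)  # (n, n_frames, n_bands)
    # Stage 1 (spectral): each frame is one sample.
    # Assumption: every frame inherits the class label of its block.
    frames = by_frame.reshape(-1, n_bands)
    frame_labels = np.repeat(y, n_frames)
    W_s = lda_directions(frames, frame_labels, n_spectral)
    # Stage 2 (temporal): one LDA per spectral discriminant trajectory.
    traj = by_frame @ W_s  # (n, n_frames, n_spectral)
    W_t = [lda_directions(traj[:, :, k], y, n_temporal)
           for k in range(n_spectral)]
    return W_s, W_t

def extract_features(blocks, W_s, W_t):
    """Project blocks onto the spectral, then the temporal discriminants."""
    traj = blocks.transpose(0, 2, 1) @ W_s
    feats = [traj[:, :, k] @ W_t[k] for k in range(len(W_t))]
    return np.concatenate(feats, axis=1)
```

Calling `sequential_discriminants(blocks, y, n_spectral=3, n_temporal=2)` on an array of shape `(n_samples, 15, 101)` and passing the result to `extract_features` yields 3 × 2 = 6 features per block. The joint alternative would instead run a single LDA on the flattened 1515-dimensional vectors, which requires estimating a much larger scatter matrix.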
