Optimized discriminative transformations for speech features based on minimum classification error

Feature extraction is an important step in pattern classification and speech recognition. Extracted features should discriminate between classes while remaining robust to environmental conditions such as noise. To this end, transformations are often applied to the features. In this paper, we propose a framework that improves independent feature transformations such as PCA (Principal Component Analysis) and HLDA (Heteroscedastic LDA) using the minimum classification error (MCE) criterion. In this method, we modify the full transformation matrices so that the classification error is minimized for the mapped features. The mapping does not reduce the dimension of the feature vector. The proposed methods are evaluated on continuous phoneme recognition using clean and noisy TIMIT. Experimental results show that the proposed methods improve the performance of the PCA and HLDA transformations of MFCC features in both clean and noisy conditions.
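
As a rough illustration of the idea described in the abstract, the following Python sketch starts from a full-rank PCA matrix and adjusts it by gradient descent on a smoothed classification error. It is not the authors' implementation, and several parts are assumptions made for brevity: a nearest-class-mean classifier stands in for the HMM-based phoneme recognizer, the misclassification measure uses only the best competing class, a step-halving rule keeps the loss from increasing, and synthetic Gaussian data replaces MFCC features. The names `pca_transform`, `mce_loss_and_grad`, and `mce_refine` are hypothetical.

```python
import numpy as np

def pca_transform(X):
    """Full-rank PCA rotation (no dimension reduction); rows are eigenvectors."""
    cov = np.cov(X - X.mean(axis=0), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # sort by descending variance
    return eigvecs[:, order].T

def mce_loss_and_grad(W, X, y, means, gamma=1.0):
    """Smoothed MCE loss and its gradient with respect to W.

    Assumption: discriminant g_j(x) = -||W (x - mu_j)||^2 (nearest class mean),
    misclassification measure d = g_best_competitor - g_true.
    Labels y are assumed to be integers 0..K-1.
    """
    n = np.arange(len(X))
    diffs = X[:, None, :] - means[None, :, :]   # (N, K, d): x_n - mu_j
    dist = np.sum((diffs @ W.T) ** 2, axis=2)   # (N, K): ||W (x_n - mu_j)||^2
    masked = dist.copy()
    masked[n, y] = np.inf                       # exclude the true class
    comp = np.argmin(masked, axis=1)            # best competing class
    d = dist[n, y] - dist[n, comp]              # > 0 means misclassified
    loss = 0.5 * (1.0 + np.tanh(0.5 * gamma * d))   # numerically stable sigmoid
    coeff = gamma * loss * (1.0 - loss)         # derivative of sigmoid w.r.t. d
    a, b = diffs[n, y], diffs[n, comp]
    # d/dW ||W a||^2 = 2 W a a^T, accumulated over samples with weights coeff
    S = (a * coeff[:, None]).T @ a - (b * coeff[:, None]).T @ b
    return loss.mean(), 2.0 * W @ S / len(X)

def mce_refine(W0, X, y, gamma=1.0, lr=0.1, n_iter=50):
    """Refine W0 by gradient descent; a step is kept only if the loss does not rise."""
    means = np.stack([X[y == k].mean(axis=0) for k in np.unique(y)])
    W = W0.copy()
    loss, grad = mce_loss_and_grad(W, X, y, means, gamma)
    for _ in range(n_iter):
        W_new = W - lr * grad
        new_loss, new_grad = mce_loss_and_grad(W_new, X, y, means, gamma)
        if new_loss <= loss:
            W, loss, grad = W_new, new_loss, new_grad
        else:
            lr *= 0.5                           # step too large: halve it
    return W, loss

# Synthetic three-class data (an assumption; the paper uses MFCC features from TIMIT).
rng = np.random.default_rng(0)
true_means = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
X = np.concatenate([rng.normal(m, 1.0, size=(100, 3)) for m in true_means])
y = np.repeat(np.arange(3), 100)

W0 = pca_transform(X)
class_means = np.stack([X[y == k].mean(axis=0) for k in range(3)])
loss_before, _ = mce_loss_and_grad(W0, X, y, class_means)
W, loss_after = mce_refine(W0, X, y)
print("smoothed MCE loss before:", loss_before, "after:", loss_after)
```

The refined matrix `W` stays square, so the feature dimension is preserved, in line with the abstract's statement that the mapping does not reduce dimensionality. The same loop could in principle start from an HLDA matrix instead of `W0`.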
