Paper Information - A Non-Linear Speaker Adaptation Technique using Kernel Ridge Regression

A Non-Linear Speaker Adaptation Technique using Kernel Ridge Regression

We propose a non-linear model space transformation for speaker or environment adaptation based on weighted kernel ridge regression (KRR). The transformation is given by a generalized least squares linear regression in a kernel-induced feature space, operating on Gaussian mixture model means and having the adaptation frames as targets. Using the "kernel trick", the solution to the optimization problem is obtained by solving a system of linear equations involving the Gram matrix of the input variables. We show that MLLR is a special case of KRR when a linear kernel is employed. Furthermore, we study an efficient low-rank approximation to the kernel matrix, termed the "rectangle method", in which the regressors are chosen to be a small set of clustered adaptation frames. Experiments conducted on the EARS database (English conversational telephone speech) indicate that KRR with a Gaussian RBF kernel outperforms standard regression class-based MLLR.
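The dual-form solve mentioned in the abstract can be illustrated with a minimal sketch of generic weighted kernel ridge regression. This is not the paper's exact formulation: the function names, the toy data, the per-point weights `w`, the regularizer `lam`, and the kernel width `gamma` are all assumptions made for illustration. The inputs stand in for Gaussian means and the targets for adaptation statistics. The last part checks numerically that, with a linear kernel plus a bias term, the dual solution matches a weighted ridge-regularized affine transform of the inputs, which is the kind of transform MLLR estimates.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def linear_kernel(A, B):
    """Linear kernel with a constant term, i.e. inner product of [x, 1] vectors."""
    return A @ B.T + 1.0

def fit_weighted_krr(K, Y, w, lam):
    """Solve (W K + lam I) alpha = W Y for the dual coefficients alpha.

    K   : (n, n) Gram matrix of the inputs
    Y   : (n, d) regression targets
    w   : (n,)   non-negative per-point weights (hypothetical, e.g. occupancy counts)
    lam : ridge regularization strength
    """
    n = K.shape[0]
    return np.linalg.solve(w[:, None] * K + lam * np.eye(n), w[:, None] * Y)

def predict(K_new, alpha):
    """Evaluate the fitted function given kernel values against the training inputs."""
    return K_new @ alpha

# Toy data (assumed): 50 "means" in 3 dimensions with non-linearly warped targets.
rng = np.random.default_rng(0)
means = rng.normal(size=(50, 3))
targets = np.tanh(means) + 0.01 * rng.normal(size=(50, 3))
w = rng.uniform(0.5, 2.0, size=50)
lam = 1e-2

# Non-linear adaptation with a Gaussian RBF kernel.
alpha_rbf = fit_weighted_krr(rbf_kernel(means, means, gamma=0.5), targets, w, lam)
adapted_rbf = predict(rbf_kernel(means, means, gamma=0.5), alpha_rbf)

# Linear kernel: dual solution versus the explicit weighted ridge affine transform.
alpha_lin = fit_weighted_krr(linear_kernel(means, means), targets, w, lam)
pred_dual = predict(linear_kernel(means, means), alpha_lin)

X_aug = np.hstack([means, np.ones((means.shape[0], 1))])  # append bias column
XtW = X_aug.T * w                                         # X^T W
B = np.linalg.solve(XtW @ X_aug + lam * np.eye(X_aug.shape[1]), XtW @ targets)
pred_primal = X_aug @ B

print("max |dual - primal| with linear kernel:", np.abs(pred_dual - pred_primal).max())
```

The dual system has one unknown row per training input, so its cost grows with the number of regressors; that is the motivation for low-rank approximations of the kernel matrix such as the one the abstract describes.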

George Saon

[1] Nello Cristianini, et al. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, 2000.

[2] Hakan Erdogan, et al. Kernel Discriminant Analysis for Speech Recognition, 2004.

[3] Peder A. Olsen, et al. Feature adaptation using projection of Gaussian posteriors, 2005, INTERSPEECH.

[4] Roger Hsiao, et al. Improving eigenspace-based MLLR adaptation by kernel PCA, 2004, INTERSPEECH.

[5] Mark J. F. Gales, et al. Temporally varying model parameters for large vocabulary continuous speech recognition, 2005, INTERSPEECH.

[6] Chin-Hui Lee, et al. Maximum a posteriori linear regression for hidden Markov model adaptation, 1999, EUROSPEECH.

[7] Geoffrey Zweig, et al. The IBM 2004 conversational telephony system for rich transcription, 2005, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05).

[8] George Saon, et al. Feature space Gaussianization, 2004, IEEE International Conference on Acoustics, Speech, and Signal Processing.

[9] Geoffrey Zweig, et al. fMPE: discriminatively trained features for speech recognition, 2005, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05).

[10] Philip C. Woodland, et al. Speaker adaptation of HMMs using linear regression, 1994.

[11] Jing Peng, et al. SVM vs regularized least squares classification, 2004, ICPR.

[12] Mukund Padmanabhan, et al. A nonlinear unsupervised adaptation technique for speech recognition, 2000, INTERSPEECH.