Improved speaker adaptation using speaker dependent feature projections

We extend the formulation of constrained maximum likelihood linear regression (CMLLR) adaptation to take into account full covariance matrices in the adapted model, and we use it in conjunction with heteroscedastic linear discriminant analysis (HLDA) in order to estimate speaker dependent feature projections on both training and test data. Results on the broadcast news corpus show that the proposed HLDA adaptation technique is very effective, even when combined with traditional CMLLR and MLLR adaptation, providing up to 8% relative improvement in recognition accuracy.

[1]  Mark J. F. Gales,et al.  Semi-tied covariance matrices for hidden Markov models , 1999, IEEE Trans. Speech Audio Process..

[2]  George Saon,et al.  Eliminating inter-speaker variability prior to discriminant transforms , 2001, IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01..

[3]  L. Baum,et al.  An inequality with applications to statistical estimation for probabilistic functions of Markov processes and to a model for ecology , 1967 .

[4]  Andreas G. Andreou,et al.  Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition , 1998, Speech Commun..

[5]  Geoffrey Zweig,et al.  Linear feature space projections for speaker adaptation , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[6]  George Saon,et al.  Maximum likelihood discriminant feature spaces , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[7]  H Hermansky,et al.  Perceptual linear predictive (PLP) analysis of speech. , 1990, The Journal of the Acoustical Society of America.

[8]  Philip C. Woodland,et al.  Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models , 1995, Comput. Speech Lang..

[9]  Mark J. F. Gales,et al.  Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang..

[10]  Philip C. Woodland Speaker adaptation for continuous density HMMs: a review , 2001 .

[11]  Richard M. Schwartz,et al.  A compact model for speaker-adaptive training , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.