Incremental on-line feature space MLLR adaptation for telephony speech recognition

In this paper, we present a method for incremental on-line adaptation based on feature space Maximum Likelihood Linear Regression (FMLLR) for telephony speech recognition applications. We explain how to incorporate a feature space MLLR transform into a stack decoder and perform on-line adaptation. The issues discussed are as follows: collecting adaptation data on-line and in real time; mapping adaptation data from previous feature space to the present feature space; and smoothing adaptation statistics with initial statistics based on original acoustical model to achieve stability. Testing results on various systems demonstrate that on-line incremental FMLLR adaptation could be an effective and stable method when the adaptation statistics are mapped and smoothed.

[1]  Philip C. Woodland,et al.  Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models , 1995, Comput. Speech Lang..

[2]  Mark J. F. Gales,et al.  Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang..

[3]  William J. Byrne,et al.  Discounted likelihood linear regression for rapid adaptation , 1999, EUROSPEECH.

[4]  Michael Picheny,et al.  Maximal rank likelihood as an optimization function for speech recognition , 2000, INTERSPEECH.

[5]  Geoffrey Zweig,et al.  Linear feature space projections for speaker adaptation , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[6]  Benoît Maison,et al.  Robust confidence annotation and rejection for continuous speech recognition , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).