Unsupervised language model adaptation

This paper investigates unsupervised language model adaptation from ASR transcripts. N-gram counts from these transcripts can be used either to adapt an existing n-gram model or to build an n-gram model from scratch. Experimental results are reported for one domain adaptation task: building a customer care application starting from a general voicemail transcription system. The experiments compare several adaptation strategies, including iterative adaptation and self-adaptation on the test data. Using 17 hours of unsupervised adaptation material, they show an absolute error rate reduction of 3.9% over the unadapted baseline, from 28% to 24.1%. This is 51% of the 7.7% gain obtained by supervised adaptation. Self-adaptation on the test data yielded a 1.3% improvement over the baseline.
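To make the idea of adapting an n-gram model with counts from ASR transcripts concrete, the following is a minimal sketch that assumes simple weighted count merging over bigrams. The abstract does not specify the adaptation formula, so the merging scheme, the `weight` parameter, the function names, and the toy sentences are all illustrative assumptions rather than the paper's method.

```python
from collections import Counter


def bigram_counts(sentences):
    """Collect bigram counts from whitespace-tokenized sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        counts.update(zip(tokens, tokens[1:]))
    return counts


def merge_counts(base, adapt, weight=1.0):
    """Add weighted in-domain counts to the baseline counts.

    Assumption: plain count merging; `weight` is a hypothetical knob
    controlling how much the (possibly errorful) ASR counts matter.
    """
    merged = Counter()
    for bigram, count in base.items():
        merged[bigram] += count
    for bigram, count in adapt.items():
        merged[bigram] += weight * count
    return merged


def bigram_prob(counts, prev, word):
    """Unsmoothed maximum-likelihood estimate of P(word | prev)."""
    total = sum(c for (p, _), c in counts.items() if p == prev)
    return counts[(prev, word)] / total if total else 0.0


# Toy data: out-of-domain baseline text vs. ASR output on in-domain audio.
baseline = bigram_counts(["please call me back", "call me tomorrow"])
asr_hyps = bigram_counts(["call me about my bill", "my bill is wrong"])
adapted = merge_counts(baseline, asr_hyps, weight=2.0)
```

In this toy setting, `bigram_prob(baseline, "my", "bill")` is zero because the baseline text never contains "my", whereas the adapted counts assign that bigram nonzero probability. A practical system would also apply smoothing, which this sketch omits.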
