Self-adaptation using eigenvoices for large-vocabulary continuous speech recognition

In this paper, we present the application of eigenvoices to self-adaptation. Eigenvoice adaptation is well suited to this task for three reasons. First, it is an extremely fast adaptation algorithm and therefore works with very small amounts of adaptation data. Second, it is believed to be comparatively tolerant of recognition errors. Third, it explicitly reduces dimensionality, which makes the computation of the likelihood compact; this likelihood can be exploited as an embedded confidence measure to minimize the impact of errors in the transcription. Our experiments were carried out on the Wall Street Journal (WSJ) evaluation task. We reduced our word error rate (WER) by one percent absolute, to 9.7%.
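To make the dimensionality argument concrete, the following is a minimal sketch of eigenvoice weight estimation, not the implementation used in the paper. It assumes a simplified maximum-likelihood formulation with diagonal covariances in which the adapted mean supervector is the mean voice plus a weighted sum of eigenvoices. All function names, variable names, and toy values are hypothetical.

```python
# Sketch only: simplified eigenvoice weight estimation (assumed formulation,
# diagonal covariances, statistics flattened to one entry per supervector dim).

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def estimate_weights(mean, eigenvoices, counts, first_order, precisions):
    """Estimate one weight per eigenvoice from accumulated statistics.

    counts[d]      : occupancy of the Gaussian that owns dimension d
    first_order[d] : occupancy-weighted sum of observations in dimension d
    precisions[d]  : inverse variance of dimension d
    """
    k = len(eigenvoices)
    # Precision-weighted residual of the data against the mean voice.
    resid = [p * (s - g * mu)
             for p, s, g, mu in zip(precisions, first_order, counts, mean)]
    a = [[sum(g * p * ei * ej
              for g, p, ei, ej in zip(counts, precisions,
                                      eigenvoices[i], eigenvoices[j]))
          for j in range(k)] for i in range(k)]
    b = [sum(e * r for e, r in zip(eigenvoices[i], resid)) for i in range(k)]
    return solve_linear(a, b)  # only a k-by-k system, k = number of eigenvoices


def adapted_mean(mean, eigenvoices, weights):
    """Mean voice plus the weighted sum of eigenvoices."""
    return [mu + sum(w * e[d] for w, e in zip(weights, eigenvoices))
            for d, mu in enumerate(mean)]


# Toy example: 4-dimensional supervector, 2 eigenvoices (hypothetical values).
mean = [0.0, 0.0, 0.0, 0.0]
eigenvoices = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
true_weights = [0.5, -2.0]
target = adapted_mean(mean, eigenvoices, true_weights)

counts = [3.0, 3.0, 5.0, 0.0]          # last dimension was never observed
precisions = [1.0, 2.0, 1.0, 0.5]
first_order = [g * t for g, t in zip(counts, target)]  # noise-free statistics

weights = estimate_weights(mean, eigenvoices, counts, first_order, precisions)
speaker_mean = adapted_mean(mean, eigenvoices, weights)
```

In this toy setup the number of free parameters equals the number of eigenvoices rather than the supervector dimension, so the weights can be estimated even though one dimension has no data, and the unobserved dimension is still updated through the eigenvoices.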
