Dynamic language model adaptation using variational Bayes inference

We propose an unsupervised dynamic language model (LM) adaptation framework using long-distance latent topic mixtures. The framework employs the Latent Dirichlet Allocation model (LDA) which models the latent topics of a document collection in an unsupervised and Bayesian fashion. In the LDA model, each word is modeled as a mixture of latent topics. Varying topics within a context can be modeled by re-sampling the mixture weights of the latent topics from a prior Dirichlet distribution. The model can be trained using the variational Bayes Expectation Maximization algorithm. During decoding, mixture weights of the latent topics are adapted dynamically using the hypotheses of previously decoded utterances. In our work, the LDA model is combined with the trigram language model using linear interpolation. We evaluated the approach on the CCTV episode of the RT04 Mandarin Broadcast News test set. Results show that the proposed approach reduces the perplexity by up to 15.4% relative and the character error rate by 4.9% relative depending on the size and setup of the training set.

[1]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[2]  J.R. Bellegarda,et al.  Exploiting latent semantic information in statistical language modeling , 2000, Proceedings of the IEEE.

[3]  Michael I. Jordan,et al.  An Introduction to Variational Methods for Graphical Models , 1999, Machine-mediated learning.

[4]  Philip Clarkson,et al.  The applicability of adaptive language modelling for the broadcast news task , 1998, ICSLP.

[5]  Tanja Schultz,et al.  The ISL RT04 Mandarin Broadcast News Evaluation System , 2004 .

[6]  Anthony J. Robinson,et al.  Language model adaptation using mixtures and an exponentially decaying cache , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[7]  Philip C. Woodland,et al.  A PLSA-based language model for conversational telephone speech , 2004, INTERSPEECH.

[8]  Thomas Hofmann,et al.  Topic-based language models using EM , 1999, EUROSPEECH.

[9]  Renato De Mori,et al.  A Cache-Based Natural Language Model for Speech Recognition , 1990, IEEE Trans. Pattern Anal. Mach. Intell..

[10]  Mari Ostendorf,et al.  Modeling long distance dependence in language: topic mixtures vs. dynamic cache models , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.