Topic n-gram count language model adaptation for speech recognition

We introduce novel language model (LM) adaptation approaches based on the latent Dirichlet allocation (LDA) model. N-grams observed in the training set are assigned to topics by either soft or hard clustering. In soft clustering, each n-gram is distributed across topics so that its total count over all topics equals its global count in the training set: the n-gram's normalized topic weights are multiplied by its global count to form the topic n-gram counts. In hard clustering, the full global count of each n-gram is assigned to a single topic, namely the topic with the maximum topic weight for that n-gram. Topic n-gram count LMs are built from the respective topic n-gram counts and adapted using topic weights estimated from a development test set. The topic weights are obtained by averaging two confidence measures, the probability of a word given a topic and the probability of a topic given a word; the average is taken over the words of an n-gram to form that n-gram's topic weights, and over the words of the development test set to form the adaptation weights. Our approaches outperform several traditional adaptation approaches on the Wall Street Journal (WSJ) corpus.
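To make the count-splitting step concrete, below is a minimal Python/NumPy sketch of the two clustering schemes, not the paper's actual implementation. It assumes the LDA posteriors are available as dense arrays (the names `p_w_given_z` and `p_z_given_w`, word-id n-grams, and the dictionary of global counts are all illustrative assumptions).

```python
import numpy as np

def ngram_topic_weights(ngram, p_w_given_z, p_z_given_w):
    """Topic weights of an n-gram: the average of the two confidence
    measures P(word|topic) and P(topic|word) over the words of the
    n-gram, renormalized to sum to one over topics.

    ngram       : tuple of word ids (assumed representation)
    p_w_given_z : (K, V) array, P(word | topic) from LDA
    p_z_given_w : (V, K) array, P(topic | word) from LDA
    """
    scores = np.mean(
        [0.5 * (p_w_given_z[:, w] + p_z_given_w[w, :]) for w in ngram],
        axis=0,
    )
    return scores / scores.sum()

def topic_ngram_counts(global_counts, topic_weights, K, hard=False):
    """Split global n-gram counts into K per-topic count tables.

    Soft clustering distributes each count in proportion to the
    n-gram's topic weights, so the per-topic counts sum back to the
    global count; hard clustering assigns the whole count to the
    single topic with the maximum weight.
    """
    tables = [dict() for _ in range(K)]
    for ng, count in global_counts.items():
        w = topic_weights[ng]
        if hard:
            tables[int(np.argmax(w))][ng] = count
        else:
            for k in range(K):
                tables[k][ng] = count * w[k]
    return tables
```

Under this reading, each per-topic count table would be used to train one topic LM, and the topic LMs would then be combined with mixture weights obtained by applying the same averaged confidence measures to the words of the development test set.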
