Web Augmentation of Language Models for Continuous Speech Recognition of SMS Text Messages

In this paper, we present an efficient query selection algorithm for the retrieval of web text data to augment a statistical language model (LM). The number of retrieved relevant documents is optimized with respect to the number of queries submitted. The querying scheme is applied in the domain of SMS text messages. Continuous speech recognition experiments are conducted on three languages: English, Spanish, and French. The web data is utilized for augmenting in-domain LMs in general and for adapting the LMs to a user-specific vocabulary. Word error rate reductions of up to 6.6% (in LM augmentation) and 26.0% (in LM adaptation) are obtained in setups, where the size of the web mixture LM is limited to the size of the baseline in-domain LM.

[1]  Yan Huang,et al.  Vocabulary and language model adaptation using information retrieval , 2004, INTERSPEECH.

[2]  Andreas Stolcke,et al.  Web resources for language modeling in conversational speech recognition , 2007, TSLP.

[3]  Steve J. Young,et al.  Bootstrapping language models for dialogue systems , 2006, INTERSPEECH.

[4]  Andreas Stolcke,et al.  Entropy-based Pruning of Backoff Language Models , 2000, ArXiv.

[5]  Ronald Rosenfeld,et al.  Improving trigram language modeling with the World Wide Web , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[6]  Tatsuya Kawahara,et al.  A bootstrapping approach for developing language model of new spoken dialogue systems by selecting web texts , 2006, INTERSPEECH.

[7]  Bhuvana Ramabhadran,et al.  Data Driven Approach for Language Model Adaptation using Stepwise Relative Entropy Minimization , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[8]  Thomas Hain,et al.  Strategies for Language Model Web-Data Collection , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[9]  Andreas Stolcke,et al.  Language Modeling in the ICSI-SRI Spring 2005 Meeting Speech Recognition Evaluation System , 2005 .

[10]  Joshua Goodman,et al.  A bit of progress in language modeling , 2001, Comput. Speech Lang..

[11]  Andreas Stolcke,et al.  Getting More Mileage from Web Text Sources for Conversational Speech Language Modeling using Class-Dependent Mixtures , 2003, NAACL.

[12]  Ruhi Sarikaya,et al.  Rapid language model development using external resources for new spoken dialog domains , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..

[13]  Robert Miller,et al.  Just-in-time language modelling , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[14]  Andreas Stolcke,et al.  SRILM - an extensible language modeling toolkit , 2002, INTERSPEECH.

[15]  Thorsten Brants,et al.  Large Language Models in Machine Translation , 2007, EMNLP.

[16]  F ChenStanley,et al.  An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.

[17]  Guo-Hong Ding,et al.  A decoder for large vocabulary continuous short message dictation on embedded devices , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[18]  Slava M. Katz,et al.  Estimation of probabilities from sparse data for the language model component of a speech recognizer , 1987, IEEE Trans. Acoust. Speech Signal Process..

[19]  Teemu Hirsimäki,et al.  On Growing and Pruning Kneser–Ney Smoothed $ N$-Gram Models , 2007, IEEE Transactions on Audio, Speech, and Language Processing.