Paper information - Recognition of multilingual speech in mobile applications

We evaluate different architectures to recognize multilingual speech for real-time mobile applications. In particular, we show that combining the results of several recognizers greatly outperforms other solutions such as training a single large multilingual system or using an explicit language identification system to select the appropriate recognizer. Experiments are conducted on a trilingual English-French-Mandarin mobile speech task. The data set includes Google searches, Maps queries, as well as more general inputs such as email and short message dictation. Without pre-specifying the input language, the combined system achieves comparable accuracy to that of the monolingual systems when the input language is known. The combined system is also roughly 5% absolute better than an explicit language identification approach, and 10% better than a single large multilingual system.

Hui Lin | Jui-Ting Huang | Françoise Beaufays | Yun-Hsuan Sung | Brian Strope
