Large-Scale Language Modeling with Random Forests for Mandarin Chinese Speech-to-Text

In this work, the random forest language modeling approach is applied with the aim of improving the performance of LIMSI's highly competitive Mandarin Chinese speech-to-text system. The experimental setup is that of the GALE Phase 4 evaluation, which is characterized by a large amount of available language model training data (over 3.2 billion segmented words). A conventional unpruned 4-gram language model with a 56K-word vocabulary serves as a baseline that is challenging to improve upon. Nevertheless, moderate perplexity and character error rate (CER) improvements over this baseline were obtained with a random forest language model. Different random forest training strategies were explored to attain the maximal performance gain, and a Forest of Random Forests language modeling scheme is introduced.
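The core idea behind random forest language modeling can be illustrated with a toy sketch: each "tree" conditions the next word on a randomly chosen subset of its n-gram history, and the forest probability is the average over trees. The code below is a minimal, hypothetical illustration only (tiny corpus, add-one smoothing instead of the decision-tree growing and held-out smoothing used in real systems); all function names and the corpus are invented for this example.

```python
import math
import random
from collections import defaultdict

def train_tree(corpus, order, rng):
    """One randomized 'tree' (toy version): condition each word on a
    random subset of the (order-1)-word history."""
    keep = sorted(rng.sample(range(order - 1), rng.randint(1, order - 1)))
    counts, totals = defaultdict(int), defaultdict(int)
    for sent in corpus:
        padded = ["<s>"] * (order - 1) + sent + ["</s>"]
        for i in range(order - 1, len(padded)):
            hist = tuple(padded[i - order + 1 + k] for k in keep)
            counts[(hist, padded[i])] += 1
            totals[hist] += 1
    return keep, counts, totals

def tree_prob(tree, hist_full, word, vocab_size):
    keep, counts, totals = tree
    hist = tuple(hist_full[k] for k in keep)
    # Add-one smoothing so unseen events keep nonzero mass
    return (counts[(hist, word)] + 1) / (totals[hist] + vocab_size)

def forest_perplexity(corpus, trees, order, vocab_size):
    """Perplexity under the forest: average tree probabilities per event."""
    logp, n = 0.0, 0
    for sent in corpus:
        padded = ["<s>"] * (order - 1) + sent + ["</s>"]
        for i in range(order - 1, len(padded)):
            hist_full = padded[i - order + 1:i]
            p = sum(tree_prob(t, hist_full, padded[i], vocab_size)
                    for t in trees) / len(trees)
            logp += math.log(p)
            n += 1
    return math.exp(-logp / n)

# Toy segmented-Mandarin corpus (illustrative only)
corpus = [["我", "爱", "你"], ["你", "爱", "我"]]
vocab = {w for s in corpus for w in s} | {"<s>", "</s>"}
rng = random.Random(0)
trees = [train_tree(corpus, 4, rng) for _ in range(8)]
ppl = forest_perplexity(corpus, trees, 4, len(vocab))
print(round(ppl, 2))
```

Averaging over randomized trees is what lets the forest smooth more aggressively than any single tree; the Forest of Random Forests scheme introduced in the paper extends this by combining multiple such forests.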
