Hallucinated n-best lists for discriminative language modeling

This paper investigates semi-supervised methods for discriminative language modeling, in which n-best lists are “hallucinated” for given reference text and then used to train n-gram language models with the perceptron algorithm. We perform controlled experiments on a very strong baseline English CTS system, comparing three methods for simulating ASR output, and compare the results against training on “real” n-best lists produced by the baseline recognizer. We find that methods based on extracting phrasal cohorts, similar to phrase-table extraction methods in machine translation, yield the largest gains of the three, achieving over half of the WER reduction obtained with fully supervised training.
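The training procedure summarized above follows the standard perceptron-based n-best reranking setup for discriminative n-gram language modeling. Below is a minimal sketch of that setup, assuming a simple data layout of (reference, n-best list) pairs; the n-best lists may be real recognizer output or hallucinated confusions generated from the reference text. All function and variable names (`ngram_features`, `perceptron_train`, etc.) are illustrative rather than taken from the paper, and the toy word-level error function stands in for a proper edit-distance alignment.

```python
# Sketch of perceptron training for a discriminative n-gram reranker.
# Hypothetical data layout: each training example pairs a reference transcript
# with an n-best list of (hypothesis_words, baseline_score) candidates, which
# may be real recognizer output or "hallucinated" confusions of the reference.
from collections import defaultdict


def ngram_features(words):
    """Count unigram and bigram features of a hypothesis."""
    feats = defaultdict(float)
    padded = ["<s>"] + list(words) + ["</s>"]
    for i, w in enumerate(padded):
        feats[("1g", w)] += 1.0
        if i + 1 < len(padded):
            feats[("2g", (w, padded[i + 1]))] += 1.0
    return feats


def word_error(hyp, ref):
    """Toy loss; a real system would use edit (Levenshtein) distance."""
    return sum(h != r for h, r in zip(hyp, ref)) + abs(len(hyp) - len(ref))


def rerank_score(weights, hyp, baseline_score, scale=1.0):
    """Combine the baseline recognizer score with the learned n-gram features."""
    return scale * baseline_score + sum(
        weights[f] * v for f, v in ngram_features(hyp).items()
    )


def perceptron_train(data, epochs=3):
    """data: list of (reference_words, [(hyp_words, baseline_score), ...])."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for ref, nbest in data:
            # Oracle = lowest-error hypothesis in the (possibly hallucinated) n-best list.
            oracle = min(nbest, key=lambda h: word_error(h[0], ref))[0]
            best = max(nbest, key=lambda h: rerank_score(weights, h[0], h[1]))[0]
            if best != oracle:
                # Perceptron update: promote features of the oracle hypothesis,
                # demote features of the current top-scoring (erroneous) one.
                for f, v in ngram_features(oracle).items():
                    weights[f] += v
                for f, v in ngram_features(best).items():
                    weights[f] -= v
    return weights
```

In practice, averaged perceptron weights and a tuned scale on the baseline recognizer score are typically used; the sketch omits these for brevity.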
