Mixture Pruning and Roughening for Scalable Acoustic Models

In an automatic speech recognition system using a tied-mixture acoustic model, the dominant cost in CPU time and memory lies not in evaluating and storing the Gaussians themselves but in evaluating the mixture likelihoods for each state's output distribution. Using a simple entropy-based technique for pruning the mixture weight distributions, we achieve a significant speedup in recognition on a 5000-word vocabulary task with a negligible increase in word error rate. This allows us to achieve real-time connected-word dictation on an ARM-based mobile device.
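To make the idea concrete, the sketch below is one possible reading of entropy-based mixture-weight pruning, not the specific criterion used in this work: for each state, the entropy of its weight vector gives an effective number of components (the perplexity, 2 raised to the entropy), and only that many of the largest weights are retained before renormalizing. The function name `prune_mixture_weights` and the keep-top-perplexity rule are illustrative assumptions.

```python
import numpy as np

def prune_mixture_weights(weights, floor=1e-8):
    """Hypothetical sketch of entropy-based pruning for one state's
    tied-mixture weight vector: keep roughly 2**H(w) of the largest
    weights (the effective component count implied by the entropy),
    zero the rest, and renormalize."""
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()

    # Entropy in bits over the non-negligible weights; its perplexity
    # estimates how many components carry most of the probability mass.
    nz = w[w > floor]
    entropy = -np.sum(nz * np.log2(nz))
    n_keep = max(1, int(np.ceil(2.0 ** entropy)))

    # Retain only the n_keep largest weights, then renormalize.
    keep = np.argsort(w)[::-1][:n_keep]
    pruned = np.zeros_like(w)
    pruned[keep] = w[keep]
    return pruned / pruned.sum()

# Example: a peaked weight distribution over a 16-Gaussian codebook
# retains only a handful of components after pruning.
rng = np.random.default_rng(0)
weights = rng.dirichlet(alpha=[0.2] * 16)
print(np.count_nonzero(prune_mixture_weights(weights)))
```

With a sharply peaked weight distribution the entropy is low, so most components are pruned and the per-state mixture sum becomes correspondingly cheap; a near-uniform distribution is left largely intact.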