Language model capitalization

In many speech recognition systems, capitalization is not an inherent component of the language model: training corpora are lowercased, and counts are accumulated over sequences of lowercase words. This level of modeling is sufficient for automating voice commands or otherwise enabling users to communicate with a machine, but when the recognized speech is intended to be read by a person, as in email dictation or even some web search applications, the lack of capitalization in the transcribed text can add an extra cognitive load on the reader. For these cases, speech recognition systems often post-process the recognized text to restore capitalization. We propose folding capitalization directly into the recognition language model. Instead of post-processing, we take the approach that language should be represented in all its richness, with capitalization, diacritics, and other special symbols. From that perspective, we describe a strategy for handling poorly capitalized or uncapitalized training corpora for language modeling. The resulting recognition system retains the accuracy/latency/memory tradeoff of our uncapitalized production recognizer, while providing properly cased output.
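As a rough illustration of the general idea, and not of the specific strategy developed in this paper, the sketch below recases a lowercased training corpus using the most frequent surface form of each word observed in a well-capitalized reference corpus; the recased text could then feed standard n-gram count accumulation for a cased language model. All function and variable names are hypothetical.

```python
from collections import Counter, defaultdict

def build_case_table(well_cased_lines):
    """Count surface forms per lowercased word in a well-capitalized corpus."""
    forms = defaultdict(Counter)
    for line in well_cased_lines:
        for token in line.split():
            forms[token.lower()][token] += 1
    # Keep the most frequent cased form for each lowercased word.
    return {word: counts.most_common(1)[0][0] for word, counts in forms.items()}

def recase_corpus(lowercased_lines, case_table):
    """Recase a poorly capitalized corpus so it can train a cased n-gram LM."""
    for line in lowercased_lines:
        tokens = [case_table.get(tok, tok) for tok in line.split()]
        yield " ".join(tokens)

# Illustrative usage with toy data.
reference = ["Send email to Barack Obama", "The White House is in Washington"]
uncased = ["send email to barack obama about the white house"]
table = build_case_table(reference)
for cased in recase_corpus(uncased, table):
    print(cased)  # -> "Send email to Barack Obama about The White House"
```

A unigram table like this ignores context and is easily skewed by sentence-initial capitalization; it is only meant to convey how uncapitalized training data can be mapped back to cased word sequences before language model estimation.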