Use of acoustic prior information for confidence measure in ASR (automatic speech recognition) applications

This paper proposes a new acoustic confidence measure for automatic speech recognition hypotheses and compares it with approaches proposed in the literature. The measure takes into account prior information on the performance of the acoustic model for each phoneme. It is tested on two types of recognition errors: out-of-vocabulary words and errors due to additive noise. The paper also proposes an efficient way to interpret the raw confidence measure as a prior probability of correctness.
