Word and phone level acoustic confidence scoring

This paper presents a word-level confidence scoring technique that combines multiple features extracted from the output of a phonetic classifier. The goal of the research was to develop a robust confidence measure based strictly on acoustic information. The work focused on augmenting standard log-likelihood-ratio techniques with additional information to improve the robustness of acoustic confidence scores for word recognition tasks. The most successful approach used a Fisher linear discriminant projection to reduce a set of acoustic features, extracted from phone-level classification results, to a one-dimensional confidence score. The experiments were carried out within the JUPITER weather information system. The results indicate that the technique achieved significant improvements over standard log-likelihood-ratio techniques for confidence scoring.
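The projection step can be illustrated with a minimal sketch of the textbook two-class Fisher linear discriminant, which is not necessarily the exact procedure used in the paper. It assumes each hypothesized word is described by a small feature vector derived from phone-level scores and labeled as correctly or incorrectly recognized. The two features used here (a mean and a minimum phone log likelihood ratio), the toy values, and all function names are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def mean(rows: Sequence[Vector]) -> Vector:
    """Component-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter(rows: Sequence[Vector], mu: Vector) -> List[Vector]:
    """Scatter matrix: sum of outer products of deviations from the mean."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for row in rows:
        diff = [x - m for x, m in zip(row, mu)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s


def solve(a: List[Vector], b: Vector) -> Vector:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("within-class scatter matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fisher_direction(correct: Sequence[Vector],
                     incorrect: Sequence[Vector]) -> Vector:
    """Fisher direction w = S_w^{-1} (mu_correct - mu_incorrect)."""
    mu_c, mu_i = mean(correct), mean(incorrect)
    s_c, s_i = scatter(correct, mu_c), scatter(incorrect, mu_i)
    s_w = [[p + q for p, q in zip(rc, ri)] for rc, ri in zip(s_c, s_i)]
    return solve(s_w, [c - i for c, i in zip(mu_c, mu_i)])


def confidence(w: Vector, features: Vector) -> float:
    """Project a feature vector onto w to get a scalar confidence score."""
    return sum(wi * xi for wi, xi in zip(w, features))


# Hypothetical toy data: [mean phone LLR, minimum phone LLR] per word.
correct_words = [[1.0, 0.5], [1.2, 0.1], [0.8, 0.4], [1.5, 0.9]]
incorrect_words = [[-0.5, -1.0], [0.1, -0.8], [-0.2, -1.5], [0.3, -0.4]]

w = fisher_direction(correct_words, incorrect_words)
scores_correct = [confidence(w, x) for x in correct_words]
scores_incorrect = [confidence(w, x) for x in incorrect_words]
```

In a sketch like this, a word would be accepted when its projected score exceeds a threshold tuned on held-out data. Because the direction maximizes between-class separation relative to within-class scatter, the mean score of the correct class always lies above that of the incorrect class whenever the within-class scatter matrix is invertible.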
