Advances in confidence measures for large vocabulary

This paper addresses the correct choice and combination of confidence measures in large vocabulary speech recognition tasks. We classify single words within continuous as well as large vocabulary utterances into two categories: utterances within the vocabulary which are recognized correctly, and other utterances, namely misrecognized utterances or (less frequently) out-of-vocabulary (OOV) utterances. To this end, we investigate the classification error rate (CER) of several classes of confidence measures and transformations. In particular, we employ data-independent and data-dependent measures. The transformations we investigate include mappings of single confidence measures and linear combinations of these measures. These combinations are computed by means of neural networks trained with Bayes-optimal and with Gardner-Derrida-optimal criteria. Compared to a recognition system without confidence measures, selecting suitable (combinations of) confidence measures, neural network architectures, and training methods continuously improves the CER.
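As an illustration of the ideas named above, the following is a minimal sketch of how a linear combination of confidence measures might be trained and scored by its CER. It is not the authors' implementation: the function names, the single logistic unit trained by gradient descent on cross-entropy (a common stand-in for a posterior-estimating criterion), and the toy data are all assumptions made for this example. The Gardner-Derrida criterion is not shown. CER is assumed here to be the fraction of words whose accept/reject decision is wrong, so a system without confidence measures, which accepts every word, has a CER equal to its fraction of misrecognized or OOV words.

```python
import math


def classification_error_rate(scores, labels, threshold=0.0):
    """Fraction of words whose accept/reject decision is wrong.

    A word is accepted when its score exceeds the threshold;
    label 1 = recognized correctly, label 0 = misrecognized or OOV.
    """
    errors = sum((s > threshold) != bool(y) for s, y in zip(scores, labels))
    return errors / len(labels)


def accept_all_error_rate(labels):
    """Baseline without confidence measures: every word is accepted."""
    return sum(1 for y in labels if y == 0) / len(labels)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def combine(weights, bias, measures):
    """Linear combination of the confidence measures of one word."""
    return sum(w * m for w, m in zip(weights, measures)) + bias


def train_linear_combination(data, labels, epochs=200, lr=0.1):
    """Fit weights by stochastic gradient descent on cross-entropy loss.

    This is plain logistic regression, used here only as an assumed,
    simple example of a single-layer network.
    """
    weights, bias = [0.0] * len(data[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            grad = sigmoid(combine(weights, bias, x)) - y
            weights = [w - lr * grad * xi for w, xi in zip(weights, x)]
            bias -= lr * grad
    return weights, bias


if __name__ == "__main__":
    # Hypothetical toy data: two confidence measures per word.
    data = [(-1, -1), (-2, -1), (-1, -2), (1, 1), (2, 1), (1, 2)]
    labels = [0, 0, 0, 1, 1, 1]
    weights, bias = train_linear_combination(data, labels)
    combined = [combine(weights, bias, x) for x in data]
    print("baseline CER:", accept_all_error_rate(labels))
    print("combined CER:", classification_error_rate(combined, labels))
```

Comparing the CER of a single measure against the CER of the trained combination, both relative to the accept-all baseline, mirrors the kind of comparison the abstract describes, though real experiments would evaluate on held-out data rather than on the training set.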
