A new approach to utterance verification based on neighborhood information in model space

We propose to use neighborhood information in model space to perform utterance verification (UV). At first, we present a nested-neighborhood structure for each underlying model in model space and assume the underlying model's competing models sit in one of these neighborhoods, which is used to model alternative hypothesis in UV. Bayes factors (BF) is first introduced to UV and used as a major tool to calculate confidence measures based on the above idea. Experimental results in the Bell Labs communicator system show that the new method has dramatically improved verification performance when verifying correct words against mis-recognized words in the recognizer's output, relatively more than 20% reduction in equal error rate (EER) when comparing with the standard approach based on likelihood ratio testing and anti-models.

[1]  Keikichi Hirose,et al.  A minimax search algorithm for robust continuous speech recognition , 2000, IEEE Trans. Speech Audio Process..

[2]  Hermann Ney,et al.  A comparison of word graph and n-best list based confidence measures , 1999, EUROSPEECH.

[3]  A. O'Hagan,et al.  Fractional Bayes factors for model comparison , 1995 .

[4]  Chin-Hui Lee,et al.  Utterance verification based on neighborhood information and Bayes factors , 2002, INTERSPEECH.

[5]  Biing-Hwang Juang,et al.  Discriminative utterance verification for connected digits recognition , 1995, IEEE Trans. Speech Audio Process..

[6]  Thomas Schaaf,et al.  Confidence measures for spontaneous speech recognition , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[7]  Thomas Schaaf,et al.  Estimating confidence using word lattices , 1997, EUROSPEECH.

[8]  Keikichi Hirose,et al.  Improving Viterbi Bayesian predictive classification via sequential bayesian learning in robust speech recognition , 1999, Speech Commun..

[9]  Larry Gillick,et al.  A probabilistic approach to confidence estimation and evaluation , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[10]  Chin-Hui Lee,et al.  Vocabulary independent discriminative utterance verification for nonkeyword rejection in subword based speech recognition , 1996, IEEE Trans. Speech Audio Process..

[11]  Hsiao-Wuen Hon,et al.  Word-based acoustic confidence measures for large-vocabulary speech recognition , 1998, ICSLP.

[12]  Hermann Ney,et al.  Confidence measures for large vocabulary continuous speech recognition , 2001, IEEE Trans. Speech Audio Process..

[13]  Keikichi Hirose,et al.  Robust speech recognition based on a Bayesian prediction approach , 1999, IEEE Trans. Speech Audio Process..

[14]  M. Aitkin Posterior Bayes Factors , 1991 .

[15]  Li Deng,et al.  A Bayesian approach to the verification problem: applications to speaker verification , 2001, IEEE Trans. Speech Audio Process..

[16]  Chin-Hui Lee,et al.  A minimax classification approach with application to robust speech recognition , 1993, IEEE Trans. Speech Audio Process..

[17]  Biing-Hwang Juang,et al.  Flexible speech understanding based on combined key-phrase detection and verification , 1998, IEEE Trans. Speech Audio Process..

[18]  Myoung-Wan Koo,et al.  Speech recognition and utterance verification based on a generalized confidence score , 2001, IEEE Trans. Speech Audio Process..

[19]  Biing-Hwang Juang,et al.  Key-phrase detection and verification for flexible speech understanding , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[20]  Hong-Kwang Jeff Kuo,et al.  Dialogue management in the Bell Labs communicator system , 2000, INTERSPEECH.

[21]  Mitch Weintraub,et al.  Neural-network based measures of confidence for word recognition , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[22]  Sheryl R. Young,et al.  Detecting misrecognitions and out-of-vocabulary words , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.

[23]  Stephen J. Cox,et al.  Confidence measures for the SWITCHBOARD database , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[24]  Chin-Hui Lee,et al.  A data selection strategy for utterance verification in continuous speech recognition , 2001, INTERSPEECH.

[25]  Timothy J. Hazen,et al.  Word and phone level acoustic confidence scoring , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[26]  Lin Lawrence Chase,et al.  Word and acoustic confidence annotation for large vocabulary speech recognition , 1997, EUROSPEECH.