Rejection using rank statistics based on HMM state shortlists

We study a confidence measure for speech recognizer output based on a rank-order probability model of HMM state likelihoods. Rank models are motivated by the conjecture that statistics based on ranks are likely to be more robust than those based on the likelihood values themselves, especially when the test and training distributions are mismatched. We investigate a number of issues that arise in the development of rank models. We test the proposed rank-order model on two ASR rejection tasks: combining the log-likelihood ratio with the rank-order probability yields relative reductions in equal error rate of 31% and 8% on the two tasks, respectively, over the log-likelihood ratio alone.
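
As a minimal illustration of the robustness conjecture, and not the rank-order probability model itself, the sketch below computes the rank of one HMM state's log-likelihood among all states in a single frame. It shows that the rank is unchanged when every score is passed through a strictly increasing distortion, whereas the likelihood value itself changes. The function name `state_rank`, the frame scores, the aligned-state index, and the affine distortion are all hypothetical and chosen only for illustration.

```python
def state_rank(log_likelihoods, aligned_state):
    """Return the 1-based rank of the aligned state's score within one frame."""
    target = log_likelihoods[aligned_state]
    # Rank 1 means no other state scored strictly higher in this frame.
    return 1 + sum(1 for score in log_likelihoods if score > target)


# Hypothetical log-likelihoods of five HMM states for a single frame.
frame = [-12.0, -7.5, -9.1, -15.3, -8.2]
aligned = 4  # hypothetical index of the state picked by the alignment

# A strictly increasing distortion of the scores, standing in for a
# train/test mismatch (an assumption made purely for illustration).
distorted = [0.5 * s - 3.0 for s in frame]

rank_clean = state_rank(frame, aligned)
rank_distorted = state_rank(distorted, aligned)

print(rank_clean, rank_distorted)
print(frame[aligned], distorted[aligned])
```

Real mismatch need not act as a monotone map on the scores, so this only illustrates the best case in which ranks are exactly preserved.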
