Paper information - A two pass classifier for utterance rejection in keyword spotting

A two pass classifier for utterance rejection in keyword spotting

A classifier for utterance rejection in a hidden Markov model (HMM) based speech recognizer is presented. This classifier, termed the two-pass classifier, is a postprocessor to the HMM recognizer and consists of a two-stage discriminant analysis. The first stage employs the generalized probabilistic descent (GPD) discriminative training framework, while the second stage performs linear discrimination, combining the output of the first stage with the HMM likelihood scores. In this fashion, the classification power of the HMM is combined with that of the GPD stage, which is specifically designed for keyword/nonkeyword classification. Experimental results show that, on two separate databases, the two-pass classifier significantly outperforms a single-pass classifier based solely on the HMM likelihood scores.
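The two-stage structure described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the first pass is reduced to a linear discriminant trained by gradient descent on a sigmoid-smoothed error count (the simplest GPD-style objective), the second pass uses Fisher's linear discriminant as one possible form of "linear discrimination", and the feature vectors, HMM scores, and all function names are hypothetical toy values.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def train_gpd_stage(feats, labels, epochs=200, lr=0.1):
    """Pass 1 (assumed form): linear discriminant g(x) = w.x + b trained by
    gradient descent on a sigmoid-smoothed error count. Labels are +1 / -1."""
    w, b = [0.0] * len(feats[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(feats, labels):
            d = -y * (_dot(w, x) + b)        # misclassification measure (> 0 means error)
            loss = _sigmoid(d)               # smooth stand-in for the 0/1 error
            step = lr * loss * (1.0 - loss)  # learning rate times d(loss)/d(d)
            w = [wi + step * y * xi for wi, xi in zip(w, x)]
            b += step * y
    return w, b

def train_linear_stage(points, labels):
    """Pass 2 (assumed form): Fisher linear discriminant on 2-D points
    (first-pass score, HMM likelihood score)."""
    pos = [p for p, y in zip(points, labels) if y > 0]
    neg = [p for p, y in zip(points, labels) if y < 0]
    mean = lambda ps: (sum(p[0] for p in ps) / len(ps), sum(p[1] for p in ps) / len(ps))
    mp, mn = mean(pos), mean(neg)
    s00 = s01 = s11 = 0.0                    # within-class scatter matrix entries
    for ps, m in ((pos, mp), (neg, mn)):
        for p in ps:
            d0, d1 = p[0] - m[0], p[1] - m[1]
            s00 += d0 * d0
            s01 += d0 * d1
            s11 += d1 * d1
    s00 += 1e-6                              # small ridge keeps the matrix invertible
    s11 += 1e-6
    det = s00 * s11 - s01 * s01
    dm = (mp[0] - mn[0], mp[1] - mn[1])
    v = ((s11 * dm[0] - s01 * dm[1]) / det, (-s01 * dm[0] + s00 * dm[1]) / det)
    thr = 0.5 * (_dot(v, mp) + _dot(v, mn))  # midpoint of the projected class means
    return v, thr

def train_two_pass(feats, hmm_scores, labels):
    w, b = train_gpd_stage(feats, labels)
    points = [(_dot(w, x) + b, h) for x, h in zip(feats, hmm_scores)]
    v, thr = train_linear_stage(points, labels)
    return {"w": w, "b": b, "v": v, "thr": thr}

def accept(model, x, hmm_score):
    """Return True to accept the utterance as a keyword, False to reject it."""
    g = _dot(model["w"], x) + model["b"]
    return _dot(model["v"], (g, hmm_score)) > model["thr"]

# Hypothetical toy data: 2-D rescoring features and HMM log-likelihood scores.
FEATS = [(2.0, 1.0), (1.5, 1.2), (2.2, 0.8), (1.8, 1.1),
         (-1.0, -0.5), (-1.5, -1.0), (-0.8, -1.2), (-1.2, -0.7)]
HMM = [-10.0, -12.0, -9.0, -11.0, -20.0, -18.0, -22.0, -19.0]
LABELS = [1, 1, 1, 1, -1, -1, -1, -1]        # +1 keyword, -1 nonkeyword
MODEL = train_two_pass(FEATS, HMM, LABELS)
```

Calling `accept(MODEL, x, hmm_score)` on a new utterance applies both passes in sequence. A single-pass baseline in the sense of the abstract would correspond to thresholding `hmm_score` alone.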

Jay G. Wilpon | Rafid A. Sukkar
