Keyword-spotting using SRI's DECIPHER large-vocabulary speech-recognition system

This paper describes the application of DECIPHER, SRI's speaker-independent large-vocabulary continuous speech recognition (CSR) system, to the keyword-spotting task. The CSR system generates a transcription of the incoming spontaneous speech, and any keywords that occur in the transcription are hypothesized. Using improved models of nonkeyword speech with a CSR system is shown to yield significantly better keyword-spotting performance. The algorithm for computing a keyword's score combines acoustic, language, and duration information. One key limitation of this approach is that a keyword is hypothesized only if it appears in the Viterbi backtrace, so the system builder cannot operate effectively at high false-alarm levels when that is desired. Other algorithms are being considered for hypothesizing keywords that score well and lie on high-scoring paths. An algorithm for smoothing language model probabilities is also introduced: it combines a small amount of task-specific language model training data with a large amount of task-independent language training data, and it reduced test-set perplexity by 14%.
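
The transcription-based approach can be illustrated with a minimal sketch: scan the single best word sequence for keywords and score each occurrence. Everything named here is an assumption made for illustration, not DECIPHER's interface: the `Word` record, its per-word log-score fields, the weighted-sum combination in `keyword_score`, and the `threshold` parameter. The abstract states only that the score combines acoustic, language, and duration information; it does not say how.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One word on the recognizer's best path (hypothetical record layout)."""
    text: str
    start: float          # start time in seconds
    end: float            # end time in seconds
    acoustic_logp: float  # acoustic log score for this word
    lm_logp: float        # language model log probability
    duration_logp: float  # duration model log probability


def keyword_score(word, w_ac=1.0, w_lm=1.0, w_dur=1.0):
    """Combine the three log scores with a weighted sum (assumed form)."""
    return (w_ac * word.acoustic_logp
            + w_lm * word.lm_logp
            + w_dur * word.duration_logp)


def spot_keywords(backtrace, keywords, threshold=float("-inf")):
    """Hypothesize every keyword found on the best path.

    Returns (keyword, start, end, score) tuples whose score meets the
    threshold. A keyword absent from the backtrace is never returned,
    however low the threshold is set.
    """
    hits = []
    for word in backtrace:
        if word.text in keywords:
            score = keyword_score(word)
            if score >= threshold:
                hits.append((word.text, word.start, word.end, score))
    return hits


# Toy best path with made-up scores, for illustration only.
backtrace = [
    Word("my", 0.0, 0.2, -1.0, -0.5, -0.25),
    Word("credit", 0.2, 0.6, -3.0, -1.0, -0.5),
    Word("card", 0.6, 0.9, -2.0, -1.0, -0.5),
]
hits = spot_keywords(backtrace, {"card", "bank"})
for hit in hits:
    print(hit)
```

In this toy example "bank" can never be hypothesized because it is not on the best path, and lowering `threshold` does not change that. This is the limitation noted in the abstract: thresholding the score can only remove hypotheses, so the operating point cannot be pushed toward higher false-alarm levels.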
