Training and search algorithms for an interactive wordspotting system

Algorithms for a speaker-dependent wordspotting system based on hidden Markov models (HMMs) are described. The system allows a user to specify keywords dynamically and to train the associated HMMs via a single repetition of a keyword. Nonkeyword speech is modeled using an HMM trained from a prerecorded sample of continuous speech. The wordspotter is intended for interactive applications, such as the editing of voice mail or mixed-media documents, and for keyword indexing in audio or video recordings. The forward-backward search algorithm used in the wordspotter is compared with the Viterbi decoder on the basis of speed and accuracy. In addition, an algorithm for speaker adaptation is described which allows indexing by a user into another talker's speech.
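
To make the comparison between forward-backward search and Viterbi decoding concrete, the following minimal sketch runs both on a toy two-state discrete HMM. It is an illustration under stated assumptions, not the system described above: the function names, the two-state topology (state 0 standing in for nonkeyword speech, state 1 for a keyword), and all probabilities are hypothetical. Forward-backward yields a per-frame posterior for each state, whereas Viterbi yields a single best state sequence.

```python
import math

def forward_backward(pi, A, B, obs):
    """Per-frame state posteriors gamma[t][i], via scaled forward-backward."""
    n, T = len(pi), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]
    scale = [0.0] * T
    # Forward pass, rescaling each frame so the values sum to 1.
    alpha[0] = [pi[i] * B[i][obs[0]] for i in range(n)]
    scale[0] = sum(alpha[0])
    alpha[0] = [a / scale[0] for a in alpha[0]]
    for t in range(1, T):
        for j in range(n):
            prev = sum(alpha[t - 1][i] * A[i][j] for i in range(n))
            alpha[t][j] = prev * B[j][obs[t]]
        scale[t] = sum(alpha[t])
        alpha[t] = [a / scale[t] for a in alpha[t]]
    # Backward pass, reusing the forward scale factors.
    for t in range(T - 2, -1, -1):
        for i in range(n):
            total = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                        for j in range(n))
            beta[t][i] = total / scale[t + 1]
    # Combine and normalize to obtain posteriors.
    gamma = []
    for t in range(T):
        g = [alpha[t][i] * beta[t][i] for i in range(n)]
        z = sum(g)
        gamma.append([x / z for x in g])
    return gamma

def viterbi(pi, A, B, obs):
    """Single most likely state sequence, computed in the log domain."""
    n = len(pi)
    delta = [math.log(pi[i]) + math.log(B[i][obs[0]]) for i in range(n)]
    backpointers = []
    for o in obs[1:]:
        ptr, new_delta = [], []
        for j in range(n):
            best = max(range(n), key=lambda i: delta[i] + math.log(A[i][j]))
            ptr.append(best)
            new_delta.append(delta[best] + math.log(A[best][j])
                             + math.log(B[j][o]))
        delta = new_delta
        backpointers.append(ptr)
    # Trace back from the best final state.
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Hypothetical toy parameters (all nonzero so the logarithms are defined).
pi = [0.9, 0.1]                  # initial state probabilities
A = [[0.9, 0.1], [0.2, 0.8]]     # transition probabilities
B = [[0.8, 0.2], [0.3, 0.7]]     # emission probabilities per symbol
obs = [0, 0, 1, 1, 1, 0, 0]      # discrete observation symbols

posteriors = forward_backward(pi, A, B, obs)
best_path = viterbi(pi, A, B, obs)
print([round(g[1], 3) for g in posteriors])  # keyword-state posterior per frame
print(best_path)
```

In this sketch the posterior for state 1 gives a graded per-frame score that could be thresholded, while the Viterbi path commits to one hard segmentation; which behavior is preferable for wordspotting depends on the application.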
