Keyword Search using Modified Minimum Edit Distance Measure

A popular approach for keyword search in speech files is the phone lattice search. Recently minimum edit distance (MED) has been used as a measure of similarity between strings rather than using simple string matching while searching the phone lattice for the keyword. In this paper, we propose a variation of the MED, where the substitution penalties are automatically derived from the phone confusion matrix of the recognizer, as compared to heuristic or class based penalties used earlier. The results show that the substitution penalties derived from the phone confusion matrix lead to a considerable improvement in the accuracy of the keyword search algorithm.

[1]  James R. Glass,et al.  Lexical modeling of non-native speech for automatic speech recognition , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[2]  Peng Yu,et al.  Vocabulary-independent search in spontaneous speech , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[3]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[4]  Trumpington Street,et al.  A FAST LATTICE-BASED APPROACH TO VOCABULARY INDEPENDENT WORDSPOTTING , 1994 .

[5]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[6]  Karen Spärck Jones,et al.  Retrieving spoken documents by combining multiple index sources , 1996, SIGIR '96.

[7]  Steve Young,et al.  The HTK book , 1995 .

[8]  Sridha Sridharan,et al.  Dynamic match phone-lattice searches for very fast and accurate unrestricted vocabulary keyword spotting , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..

[9]  Karen Spärck Jones,et al.  Acoustic indexing for multimedia retrieval and browsing , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[10]  Richard Rose,et al.  A hidden Markov model based keyword recognition system , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[11]  Andrej Ljolje,et al.  Lexical Access With a Statistically-Derived Phonetic Network , 1991, HLT.

[12]  Chin-Hui Lee,et al.  Automatic recognition of keywords in unconstrained speech using hidden Markov models , 1990, IEEE Trans. Acoust. Speech Signal Process..