Performance evaluation for an HMM-based keyword spotter and a large-margin based one in noisy environments

Abstract: Keyword spotting refers to the detection of a limited number of given keywords in speech utterances. In this paper, we first review a large-margin-based keyword spotting approach that trains the keyword spotter with a discriminative method. We then evaluate the robustness of this approach under different noisy conditions. In addition, we compare its performance with that of an HMM-based keyword spotter, which uses a generative training method, under the same noisy conditions. The experimental results show that the large-margin-based keyword spotter is more robust than the HMM-based system in noisy environments.
