Duration Model-Based Post-processing for the Performance Improvement of a Keyword Spotting System

In this paper, we propose a duration model-based post-processing method that improves the performance of a keyword spotting system. Keywords are first detected by combining a keyword model, a non-keyword model, and a silence model. The proposed post-processing is then applied to each detection, using the information on the detected keyword to determine whether the keyword was detected correctly. To this end, we generate the duration model using a Gaussian distribution in order to accommodate the different duration characteristics of each phoneme. Compared with conventional anti-keyword scoring methods, the proposed method reduces both the false acceptance rate and the false rejection rate.
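The duration-based verification step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scoring rule, so the sketch assumes one Gaussian per phoneme fitted to aligned durations (in frames), a keyword score equal to the average per-phoneme log-likelihood, and a fixed acceptance threshold. The function names, the variance floor, and the threshold value are all hypothetical.

```python
import math
from collections import defaultdict

VARIANCE_FLOOR = 1.0  # assumed floor (frames^2) to avoid degenerate Gaussians


def fit_duration_model(alignments):
    """Fit one Gaussian (mean, variance) per phoneme.

    alignments: iterable of (phoneme, duration_in_frames) pairs,
    e.g. taken from forced alignments of training speech (assumption).
    """
    samples = defaultdict(list)
    for phoneme, duration in alignments:
        samples[phoneme].append(float(duration))
    model = {}
    for phoneme, values in samples.items():
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        model[phoneme] = (mean, max(var, VARIANCE_FLOOR))
    return model


def duration_score(model, segments):
    """Average Gaussian log-likelihood of the observed phoneme durations.

    segments: list of (phoneme, duration_in_frames) for one detected keyword.
    """
    total = 0.0
    for phoneme, duration in segments:
        mean, var = model[phoneme]
        total += -0.5 * (math.log(2.0 * math.pi * var)
                         + (duration - mean) ** 2 / var)
    return total / len(segments)


def accept_keyword(model, segments, threshold):
    """Keep the detection only if its duration score reaches the threshold."""
    return duration_score(model, segments) >= threshold


# Toy usage with made-up durations for two phonemes.
training = [("k", 8), ("k", 10), ("k", 12), ("iy", 18), ("iy", 20), ("iy", 22)]
model = fit_duration_model(training)
plausible = [("k", 10), ("iy", 20)]
implausible = [("k", 1), ("iy", 60)]
print(accept_keyword(model, plausible, threshold=-5.0))
print(accept_keyword(model, implausible, threshold=-5.0))
```

Averaging over phonemes keeps the score comparable across keywords of different lengths. In practice the threshold would be tuned on held-out data to trade false acceptances against false rejections.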
