Detecting and locating key words in continuous speech using linear predictive coding
This paper considers the problem of automatically detecting and locating key words in a stream of continuous speech. The system described here is a template-matching procedure which uses as its basic waveform features a set of linear prediction coefficients. The similarity measure between a segment of the template and a segment of the incoming speech stream is taken to be a ratio of minimum prediction residuals. This similarity measure is used in conjunction with a dynamic-programming time-warp algorithm developed by Bridle and a novel method for using multiple templates. Using templates and incoming speech spoken by the same person in a quiet room, an accuracy in excess of 99 percent was obtained. Further experiments are described which explore cross-speaker word spotting and the effects of noise on system performance. The results of these experiments suggest that the technique described in this paper could well form the basis for a practical system.
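The abstract's pipeline (linear prediction features, a residual-ratio similarity measure, and a dynamic-programming time warp) can be illustrated with a minimal sketch. It is not the paper's implementation. The log form of the residual ratio follows the usual formulation of Itakura's measure [1]. The time warp is a generic free-start, free-end dynamic-programming alignment standing in for Bridle's algorithm, and the paper's multiple-template method is not modeled. The function names, the prediction order of 4, the frame length of 80 samples, and the synthetic tone-plus-noise "speech" are all assumptions made for illustration.

```python
import math
import random

def autocorr(frame, order):
    """Autocorrelation lags r[0..order] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(order + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion: inverse filter a (a[0] == 1) and its minimum residual."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err

def residual_energy(a, r):
    """Energy left after filtering a frame with autocorrelation r through inverse filter a."""
    p = len(a)
    return sum(a[i] * a[j] * r[abs(i - j)] for i in range(p) for j in range(p))

def lpc_features(frame, order=4):
    """Per-frame features: (autocorrelation, LPC inverse filter, minimum residual)."""
    r = autocorr(frame, order)
    a, err = levinson(r, order)
    return r, a, err

def itakura_distance(template_feat, input_feat):
    """Log ratio of the residual under the template's filter to the input's own minimum."""
    _, a_t, _ = template_feat
    r_x, _, err_x = input_feat
    return max(0.0, math.log(residual_energy(a_t, r_x) / err_x))

def spot(template, stream):
    """Align the template against any stretch of the stream.

    Returns (end_frame_index, cost_per_template_frame) of the best match.
    """
    prev = [itakura_distance(template[0], x) for x in stream]  # free start
    for t in template[1:]:
        cur = []
        for j, x in enumerate(stream):
            best = prev[j]
            if j > 0:
                best = min(best, prev[j - 1], cur[j - 1])
            cur.append(itakura_distance(t, x) + best)
        prev = cur
    end = min(range(len(stream)), key=prev.__getitem__)  # free end
    return end, prev[end] / len(template)

def tone_frame(freq, rng, n=80):
    """Hypothetical stand-in for a speech frame: a sinusoid plus Gaussian noise."""
    return [math.sin(2 * math.pi * freq * t) + 0.1 * rng.gauss(0.0, 1.0)
            for t in range(n)]

rng = random.Random(0)
# A four-frame "key word" whose spectrum changes halfway through.
template = [lpc_features(tone_frame(f, rng)) for f in (0.15, 0.15, 0.30, 0.30)]
background = [lpc_features(tone_frame(0.05, rng)) for _ in range(10)]
# Embed the word at stream frames 5..8.
stream = background[:5] + template + background[5:]
end, cost = spot(template, stream)
print("best match ends at frame", end, "with cost per frame", cost)
```

In a detector of this kind, a key word would be declared wherever the normalized cost falls below a threshold. The threshold value and the way scores from several templates are combined are design choices that this sketch leaves open.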
[1] F. Itakura et al. Minimum prediction residual principle applied to speech recognition, 1975.
[2] Hiroaki Sakoe et al. A Dynamic Programming Approach to Continuous Speech Recognition, 1971.
[3] G. L. Clapper. Automatic word recognition, 1971, IEEE Spectrum.
[4] J. Makhoul et al. Linear prediction: A tutorial review, 1975, Proceedings of the IEEE.