An LPC-based spectral similarity measure for speech recognition in the presence of co-channel speech interference

The authors present an alternative to the enhancement paradigm for co-channel speech recognition, in which target-interference separation and target recognition occur simultaneously, driven by a model of the recognition vocabulary. The method is based on an LPC (linear predictive coding) spectral similarity measure which allows a reference spectrum to match only a subset of the poles of a noisy input spectrum, rather than requiring a whole-spectrum comparison. A preliminary evaluation of the proposed method in a speaker-trained isolated-digit recognition task suggests a reduction in error rate of 50-70% at low target-interference ratios, as compared to a conventional whole-spectrum similarity measure.
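
The abstract does not give the form of the similarity measure, so the following is only a minimal sketch of the general idea of subset pole matching, not the authors' method. It assumes autocorrelation-method LPC, takes poles as the roots of the LPC polynomial, and scores a reference by the mean distance from each reference pole to its nearest input pole. All function names are hypothetical, and `symmetric_pole_distance` is merely an illustrative stand-in for a comparison that penalizes every pole, not a conventional LPC distortion measure.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC; returns [1, a_1, ..., a_order]."""
    frame = np.asarray(frame, dtype=float)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[: len(frame) - k], frame[k:])
                  for k in range(order + 1)])
    # Solve the Toeplitz normal equations R a = -r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)]
                  for i in range(order)])
    a = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], a))


def upper_half_poles(a):
    """Poles of 1/A(z) with non-negative imaginary part."""
    poles = np.roots(a)
    return poles[poles.imag >= 0]


def subset_pole_distance(ref_poles, input_poles):
    """Mean distance from each reference pole to its nearest input pole.

    Assumed form of a subset match: input poles that no reference pole
    selects (for example, poles contributed by an interfering talker)
    add no cost.
    """
    ref_poles = np.asarray(ref_poles)
    input_poles = np.asarray(input_poles)
    d = np.abs(ref_poles[:, None] - input_poles[None, :])
    return float(d.min(axis=1).mean())


def symmetric_pole_distance(p, q):
    """Illustrative stand-in that penalizes unmatched poles on both sides."""
    return 0.5 * (subset_pole_distance(p, q) + subset_pole_distance(q, p))


# Hypothetical example: two target poles plus one interferer pole.
target = np.array([0.9 * np.exp(1j * 0.5), 0.85 * np.exp(1j * 1.5)])
interferer = np.array([0.9 * np.exp(1j * 2.5)])
mixed = np.concatenate([target, interferer])

subset_score = subset_pole_distance(target, mixed)
symmetric_score = symmetric_pole_distance(target, mixed)
```

In this toy case the subset score ignores the interferer pole while the symmetric score does not, which is the contrast the abstract draws between subset matching and whole-spectrum comparison. In practice the input frame would presumably be analyzed at a higher LPC order than the reference so that poles for both talkers can be resolved; that is an assumption here, not a detail given in the abstract.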
