Robust speech recognition using missing feature theory and vector quantization

This paper addresses speech recognition in noisy conditions when low complexity is required, as in embedded systems. In such systems, vector quantization is generally used to reduce the complexity of the recognition system (e.g. HMMs). A novel approach to vector quantization based on missing data theory is proposed. It increases the robustness of the system to noise with only a small increase in computational requirements. The proposed algorithm has two parts. The first divides the spectro-temporal features of the noisy signal into two subspaces: the unreliable (or missing) features and the reliable (or present) features. The second defines a robust distance measure for vector quantization that compensates for the unreliable features. In noisy conditions, the proposed approach obtains results similar to those of a more classical approach that adapts the vector quantization codebook to the noisy conditions using model compensation. However, its computational requirements are lower, which makes it more suitable for a low-complexity speech recognition system.
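The two parts described above can be illustrated with a minimal sketch. The abstract does not specify how reliability is decided or how the distance compensates for unreliable features, so everything concrete here is an assumption made for illustration: the reliability test is a simple local SNR threshold based on a noise estimate, and the distance is a squared Euclidean distance restricted to the reliable components (a marginalization-style choice). The function names, the threshold, and the toy values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def reliability_mask(noisy: Sequence[float],
                     noise: Sequence[float],
                     snr_threshold: float = 1.0) -> List[bool]:
    """Part 1 (assumed criterion): a spectral component is reliable when the
    estimated speech energy (noisy minus noise, floored at 0) exceeds
    snr_threshold times the estimated noise energy."""
    mask = []
    for y, n in zip(noisy, noise):
        speech_estimate = max(y - n, 0.0)
        mask.append(speech_estimate > snr_threshold * n)
    return mask


def masked_distance(x: Sequence[float],
                    codeword: Sequence[float],
                    mask: Sequence[bool]) -> float:
    """Part 2 (assumed distance): squared Euclidean distance computed over
    the reliable components only; unreliable components are ignored."""
    return sum((a - b) ** 2 for a, b, m in zip(x, codeword, mask) if m)


def quantize(x: Sequence[float],
             codebook: Sequence[Sequence[float]],
             mask: Sequence[bool]) -> int:
    """Return the index of the nearest codeword under the masked distance.
    If no component is reliable, all distances are 0 and index 0 is returned."""
    return min(range(len(codebook)),
               key=lambda k: masked_distance(x, codebook[k], mask))


# Toy example: a codebook trained on clean speech (hypothetical values).
codebook = [[8.0, 6.0, 1.0, 1.0],
            [7.0, 5.0, 6.0, 6.0]]
# The clean frame matches codeword 0; noise dominates the last two bands.
noise_estimate = [0.5, 0.5, 5.0, 5.0]
noisy_frame = [8.5, 6.5, 6.0, 6.0]

mask = reliability_mask(noisy_frame, noise_estimate)
robust_index = quantize(noisy_frame, codebook, mask)
# Baseline: treat every component as reliable (plain Euclidean VQ).
plain_index = quantize(noisy_frame, codebook, [True] * len(noisy_frame))
print(mask, robust_index, plain_index)
```

In this toy case the plain distance is pulled toward the codeword that happens to resemble the noise in the corrupted bands, while the masked distance recovers the codeword of the clean frame. The extra cost per frame is one mask computation plus a conditional in the distance loop, which is in line with the abstract's claim of a small computational overhead, although the actual algorithm in the paper may differ.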
