Statistical estimation of unreliable features for robust speech recognition

This paper addresses the problem of robust speech recognition in noisy conditions within the framework of hidden Markov models (HMMs) and missing feature techniques. It presents a new statistical approach to the detection and estimation of unreliable features, based on a probabilistic measure and a Gaussian mixture model (GMM). In the estimation process, the GMM is compensated using the parameters of a statistical model of the additive background noise, and the GMM means are used to replace the unreliable features. The GMM-based technique is less complex than the corresponding HMM-based estimation and gives a similar improvement in recognition performance. Once the unreliable features have been replaced by the estimated clean speech features, the entire set of spectral features can be transformed to another feature domain with a higher baseline recognition rate (e.g. MFCCs) for final recognition using continuous density hidden Markov models (CDHMMs) with diagonal covariance matrices.
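
The replacement step described above can be illustrated with a minimal sketch. It is not the paper's exact procedure: it assumes that a reliability mask is already available (the detection step is not shown), that the GMM has diagonal covariances, and that no noise compensation is applied to the GMM. The function name `impute_unreliable` and all parameter names are hypothetical. Each unreliable component is replaced by a posterior-weighted combination of the GMM means, where the posteriors are computed from the reliable components only.

```python
import numpy as np

def impute_unreliable(x, reliable, weights, means, variances):
    """Replace unreliable features with a GMM-based estimate (illustrative sketch).

    x         : (D,) observed spectral feature vector
    reliable  : (D,) boolean mask, True where the feature is trusted (assumed given)
    weights   : (K,) mixture weights of a clean-speech GMM
    means     : (K, D) component means
    variances : (K, D) diagonal covariances
    """
    x = np.asarray(x, dtype=float)
    reliable = np.asarray(reliable, dtype=bool)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    # Log-likelihood of the reliable components under each mixture component
    # (the unreliable dimensions are marginalized out, which is trivial for
    # diagonal covariances: they are simply left out of the sum).
    diff = x[reliable] - means[:, reliable]
    var_r = variances[:, reliable]
    log_lik = -0.5 * np.sum(diff ** 2 / var_r + np.log(2.0 * np.pi * var_r), axis=1)

    # Component posteriors, normalized in the log domain for numerical stability.
    log_post = np.log(weights) + log_lik
    log_post -= np.max(log_post)
    post = np.exp(log_post)
    post /= post.sum()

    # Posterior-weighted combination of the component means.
    estimate = post @ means

    # Keep reliable features as observed; substitute the estimate elsewhere.
    return np.where(reliable, x, estimate), post
```

With a two-component GMM whose means are far apart, a vector whose reliable component matches the second mean has its unreliable component replaced by a value very close to that mean. The completed vector could then be passed to a further transform (for example, towards MFCCs) before recognition.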
