Spectral subtraction and missing feature modeling for speaker verification

This paper addresses robust text-independent speaker verification when some features of the target signal are heavily masked by noise. Within the framework of Gaussian mixture models (GMMs), it presents a new approach that combines spectral subtraction with statistical missing feature compensation. The spectral subtraction algorithm identifies which spectral features are missing due to noise masking. The statistical missing feature compensation then dynamically modifies the probability computations performed in the GMM recognizer. The proposed algorithm uses a variation of generalized spectral subtraction and incorporates a criterion based on the masking properties of the human auditory system. Its originality is that, instead of using fixed parameters for noise reduction and missing feature compensation, it uses the noise masking threshold to control the enhancement and model compensation processes adaptively, frame by frame, which helps to find the best tradeoff.
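The two stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes power-spectral features, a diagonal-covariance GMM, and fixed over-subtraction and floor parameters (`alpha`, `beta`), whereas the paper adapts these frame by frame from the noise masking threshold. The function names are hypothetical. Features whose subtracted power falls to the spectral floor are flagged as missing, and the GMM likelihood is then computed by marginalizing over those features, which for diagonal covariances amounts to leaving them out of the sum.

```python
import numpy as np


def spectral_subtraction_mask(noisy_power, noise_power, alpha=2.0, beta=0.01):
    """Generalized spectral subtraction with a reliability mask.

    alpha (over-subtraction factor) and beta (spectral floor) are fixed
    here for illustration only; they are assumptions of this sketch.
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    subtracted = noisy_power - alpha * noise_power
    floor = beta * noise_power
    # A feature is treated as reliable only if it stays above the floor.
    reliable = subtracted > floor
    enhanced = np.where(reliable, subtracted, floor)
    return enhanced, reliable


def marginal_gmm_log_likelihood(x, reliable, weights, means, variances):
    """Log-likelihood of one frame using only the reliable features.

    With diagonal covariances, integrating out a missing feature removes
    its term from each component's log-density.
    """
    x = np.asarray(x, dtype=float)
    reliable = np.asarray(reliable, dtype=bool)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    if not reliable.any():
        # Every feature is marginalized out, so the density integrates to 1.
        return 0.0
    xr = x[reliable]
    mu = means[:, reliable]
    var = variances[:, reliable]
    log_comp = -0.5 * np.sum(np.log(2.0 * np.pi * var) + (xr - mu) ** 2 / var, axis=1)
    log_joint = np.log(weights) + log_comp
    # Numerically stable log-sum-exp over mixture components.
    peak = log_joint.max()
    return float(peak + np.log(np.exp(log_joint - peak).sum()))


# Toy frame with three spectral bands; the second band is buried in noise.
enhanced, reliable = spectral_subtraction_mask([10.0, 1.5, 8.0], [1.0, 1.0, 1.0])
score = marginal_gmm_log_likelihood(
    enhanced,
    reliable,
    weights=[1.0],
    means=[[8.0, 0.0, 6.0]],
    variances=[[1.0, 1.0, 1.0]],
)
```

In a verification setting, the same masked score would be computed for both the claimed speaker model and a background model, so that the unreliable bands are excluded from both sides of the likelihood ratio.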
