Speech recognition systems perform poorly in the presence of corrupting noise. Missing-feature methods attempt to compensate for the noise by removing unreliable, noise-corrupted components from a spectrographic representation of the noisy speech and performing recognition with the remaining reliable components. Conventional classifier-compensation methods modify the recognition system to work with the resulting incomplete representation. This constrains them to perform recognition using spectrographic features, which are known to be inferior to cepstra. In previous work we proposed an alternative feature-compensation approach in which the unreliable components are replaced by estimates derived from the reliable components and the known statistics of clean speech. In this paper we perform a detailed comparison of various aspects of classifier-based and feature-based compensation methods. We show that although classifier-based compensation methods are superior when recognition is performed with spectrographic features, feature-based compensation methods provide better recognition performance overall, since cepstra derived from the reconstructed spectrogram can be used for recognition. In addition, they are computationally less expensive and do not require modification of the recognizer.
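To make the feature-compensation idea concrete, the following is a minimal sketch of how unreliable components of a single log-spectral frame might be replaced by estimates derived from the reliable components and clean-speech statistics. It is an illustration under stated assumptions, not the method evaluated in the paper: the clean-speech statistics are assumed to be a single Gaussian (`mean`, `cov`), the reliability mask is assumed to be given, the estimate is the Gaussian conditional mean, and the estimate is capped at the observed noisy value on the assumption that additive noise can only raise the observed energy. The function name `reconstruct_frame` and all parameter names are hypothetical.

```python
import numpy as np


def reconstruct_frame(noisy, reliable, mean, cov):
    """Fill in the unreliable components of one log-spectral frame.

    noisy    : (D,) observed noisy log-spectral vector
    reliable : (D,) boolean mask, True where a component is trusted
    mean     : (D,) mean of clean speech (assumed single Gaussian)
    cov      : (D, D) covariance of clean speech (assumed single Gaussian)
    """
    noisy = np.asarray(noisy, dtype=float)
    reliable = np.asarray(reliable, dtype=bool)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)

    out = noisy.copy()
    unreliable = ~reliable
    if not unreliable.any():
        # Nothing to reconstruct.
        return out
    if not reliable.any():
        # No evidence available: fall back to the prior mean, still capped.
        out[unreliable] = np.minimum(mean[unreliable], noisy[unreliable])
        return out

    # Conditional mean of the unreliable part given the reliable part:
    #   mu_u + Sigma_ur Sigma_rr^{-1} (x_r - mu_r)
    cov_rr = cov[np.ix_(reliable, reliable)]
    cov_ur = cov[np.ix_(unreliable, reliable)]
    residual = noisy[reliable] - mean[reliable]
    estimate = mean[unreliable] + cov_ur @ np.linalg.solve(cov_rr, residual)

    # Assumption: the noisy observation is an upper bound on the clean value.
    out[unreliable] = np.minimum(estimate, noisy[unreliable])
    return out
```

Applied frame by frame, a routine of this kind yields a complete spectrogram, from which cepstra can then be computed and passed to an unmodified recognizer.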