A new method is presented for enhancing the contrast between speech and background noise; it may be useful as a preprocessor for speech recognizers or even for improving the audible speech signal. Because the brain maintains a representation of the acoustic modulation spectrum, similar methods are proposed here for signal separation. Previous models have shown that frequency channels can be selected on the basis of modulation frequency (MF) analysis, provided the signals to be separated have sufficiently different modulations. In the present approach, instead of selecting or weighting frequency channels, the MF spectrum is processed directly. This seems to be less prone to producing "musical noise." The dominant common MF (mostly corresponding to F0) is enhanced by an adaptive bandpass filter in each frequency channel, and the low‐MF components are artificially reconstructed before transforming back. No explicit F0 measurement is required. Nonstationary sounds (detected by the width of the low‐MF part) are...
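The idea of enhancing a dominant common MF without an explicit F0 tracker can be illustrated with a rough sketch. This is not the method of the paper: it assumes the channel envelopes have already been extracted by some filterbank, it swaps the adaptive bandpass filter for a fixed-gain FFT mask around the pooled spectral peak, and it omits the low‐MF reconstruction entirely. All function names, the search range (`fmin`, `fmax`), the relative bandwidth `rel_bw`, and the `gain` are hypothetical choices made for illustration.

```python
import numpy as np


def dominant_common_mf(envelopes, fs, fmin=50.0, fmax=400.0):
    """Pick the modulation frequency with the largest pooled magnitude.

    envelopes: array of shape (channels, samples), one envelope per channel.
    fs: envelope sampling rate in Hz.
    fmin, fmax: assumed search range for the common MF (hypothetical values).
    """
    centered = envelopes - envelopes.mean(axis=1, keepdims=True)
    magnitudes = np.abs(np.fft.rfft(centered, axis=1))
    freqs = np.fft.rfftfreq(envelopes.shape[1], d=1.0 / fs)
    pooled = magnitudes.sum(axis=0)  # pool the MF spectra across channels
    band = (freqs >= fmin) & (freqs <= fmax)
    return freqs[band][np.argmax(pooled[band])]


def enhance_common_mf(envelopes, fs, rel_bw=0.2, gain=4.0):
    """Boost a band around the dominant common MF in every channel.

    A fixed-gain frequency-domain mask stands in for an adaptive bandpass
    filter; this is an assumption of the sketch, not the paper's filter.
    """
    n = envelopes.shape[1]
    f0 = dominant_common_mf(envelopes, fs)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    weights = np.ones_like(freqs)
    weights[np.abs(freqs - f0) <= rel_bw * f0 / 2.0] = gain
    spectra = np.fft.rfft(envelopes, axis=1)
    enhanced = np.fft.irfft(spectra * weights, n=n, axis=1)
    return enhanced, f0


# Synthetic demo: three channel envelopes sharing a 120 Hz modulation plus noise.
fs = 2000.0
t = np.arange(2000) / fs
rng = np.random.default_rng(0)
envelopes = (
    1.0
    + 0.5 * np.cos(2.0 * np.pi * 120.0 * t)
    + 0.2 * rng.standard_normal((3, t.size))
)
enhanced, f0 = enhance_common_mf(envelopes, fs)
```

In this toy setting the pooled MF spectrum has a single strong peak, so the common modulation is found without measuring F0 in the waveform. Real speech would need short-time analysis and a filter that tracks the peak over time.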