Reliable bands guided similarity measure for noise-robust speech recognition

Under noisy conditions, due to the redundancy of speech signal, there are some spectral bands (Reliable Bands) whose local SNR’s are high enough to be used effectively by a recognizer. A novel, phonetically motivated Reliable Bands Guided similarity measure (RBG measure) is proposed in this study. It has the following features. Firstly, for reference spectrum, frequency bands which have larger absolute energy or sharper spectral peaks are marked as reliable bands. They are to be given more weight than the other bands in the definition of the RBG measure. Secondly, within each reliable band, similarity between formant positions and formant shapes of test spectrum and reference spectrum is explicitly modelled. Lastly, the measure can automatically emphasize spectral bands whose amplitudes change abruptly, which normally contain more reliable dynamic features of the speech signal. Both the RBG measure and the Parallel Model Combination (PMC) method are tested on a speaker-independent, continuous Mandarin digit string recognition task, under 15 noisy conditions. Noises are drawn from the NOISEX92 database. The RBG measure shows an average 4.22% word accuracy score below the PMC method above 0 dB. However, it outperforms the PMC method by 8.82% at 0 dB. More importantly,the RBG measure does not rely on accurate background noise modeling, which is a difficult task in itself.