Assessing Level-Dependent Segmental Contribution to the Intelligibility of Speech Processed by Single-Channel Noise-Suppression Algorithms

Most existing single-channel noise-suppression algorithms cannot improve speech intelligibility for normal-hearing listeners; however, the underlying reason for this performance deficit is still unclear. Given that different speech segments make different perceptual contributions, the present work assesses whether the intelligibility of noisy speech can be improved by selectively suppressing its noise at high-level (vowel-dominated) or middle-level (containing vowel-consonant transitions) segments with existing single-channel noise-suppression algorithms. The speech signal was corrupted by speech-spectrum-shaped noise and a two-talker babble masker, and its noisy high- or middle-level segments were replaced by their noise-suppressed versions processed by four types of existing single-channel noise-suppression algorithms. Experimental results showed that performing segmental noise suppression at the high or middle level led to decreased intelligibility relative to unprocessed noisy speech. This suggests that the lack of intelligibility improvement by existing noise-suppression algorithms is also present at the segmental level, which may account for the deficit traditionally observed at the full-sentence level.
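The segmental processing described above can be sketched as follows: frames are labeled by their RMS level relative to the overall RMS of the sentence, and only the frames in the targeted level region are replaced by the noise-suppressed signal. This is a minimal illustration only; the frame length, the level thresholds, and the absence of cross-fading at segment boundaries are assumptions for the sketch, not the paper's exact parameters.

```python
import numpy as np

def segment_levels(x, frame_len, thresholds_db=(-10.0, 0.0)):
    """Label each frame 'low', 'mid', or 'high' by its RMS level in dB
    relative to the overall RMS of the signal (thresholds illustrative)."""
    overall_rms = np.sqrt(np.mean(x ** 2))
    labels = []
    for i in range(len(x) // frame_len):
        frame = x[i * frame_len:(i + 1) * frame_len]
        frame_rms = np.sqrt(np.mean(frame ** 2))
        rms_db = 20.0 * np.log10(frame_rms / overall_rms + 1e-12)
        if rms_db >= thresholds_db[1]:
            labels.append('high')
        elif rms_db >= thresholds_db[0]:
            labels.append('mid')
        else:
            labels.append('low')
    return labels

def splice_segments(noisy, enhanced, labels, frame_len, target='high'):
    """Replace only the frames labeled `target` in the noisy signal with
    the corresponding frames of the noise-suppressed signal."""
    out = noisy.copy()
    for i, lab in enumerate(labels):
        if lab == target:
            out[i * frame_len:(i + 1) * frame_len] = \
                enhanced[i * frame_len:(i + 1) * frame_len]
    return out
```

In the study's terms, `enhanced` would be the output of one of the four single-channel noise-suppression algorithms applied to the full noisy sentence, and `target` selects whether the high-level or middle-level segments are spliced in.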
