Enhancing Subjective Speech Intelligibility Using a Statistical Model of Speech

The intelligibility of speech in adverse noise conditions can be improved by modifying the characteristics of the clean speech prior to its presentation. An effective and flexible paradigm is to select the modification by optimizing a measure of objective intelligibility. Here we apply this paradigm at the text level and optimize a measure related to the classification error probability in an automatic speech recognition system. We apply the proposed method to a simple but powerful band-energy modification mechanism under an energy preservation constraint. Subjective evaluation results clearly indicate a significant gain in subjective intelligibility. In contrast to existing methods, the proposed approach is not restricted to a particular modification strategy and treats the notion of optimality at a level closer to that of subjective intelligibility. The computational complexity of the method is sufficiently low to enable its use in on-line applications.
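
To make the energy preservation constraint concrete, the following is a minimal sketch, not the authors' implementation. It assumes the modification can be expressed as one non-negative weight per frequency band; the function name `redistribute_band_energy` and the example weights are hypothetical. The optimization of the recognition-related measure that would select the weights is not shown.

```python
def redistribute_band_energy(band_energies, weights):
    """Reweight per-band energies while keeping the total energy unchanged.

    band_energies: non-negative energy of each frequency band (assumed input).
    weights: hypothetical non-negative per-band weights; in the paper's setting
             these would come from optimizing an intelligibility measure,
             which this sketch does not attempt.
    """
    if len(band_energies) != len(weights):
        raise ValueError("band_energies and weights must have equal length")
    if any(e < 0 for e in band_energies) or any(w < 0 for w in weights):
        raise ValueError("energies and weights must be non-negative")

    total = sum(band_energies)
    weighted = [e * w for e, w in zip(band_energies, weights)]
    weighted_total = sum(weighted)
    if weighted_total == 0:
        raise ValueError("weights remove all energy; cannot renormalize")

    # A single global scale factor restores the original total energy.
    scale = total / weighted_total
    return [v * scale for v in weighted]


if __name__ == "__main__":
    energies = [4.0, 2.0, 1.0, 1.0]   # illustrative values only
    weights = [0.5, 1.0, 2.0, 2.0]    # illustrative values only
    modified = redistribute_band_energy(energies, weights)
    print(modified, sum(modified))
```

In this sketch the weights only fix the relative emphasis of the bands; the global rescaling step is what enforces the constraint, so any candidate weight vector proposed by an optimizer yields a signal with the same total energy as the original.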
