Tackling the combined effects of reverberation and masking noise using ideal channel selection.

PURPOSE: In this article, a new signal-processing algorithm is proposed and evaluated for suppressing the combined effects of reverberation and noise.

METHOD: The proposed algorithm decomposes the reverberant stimuli, on a short-term basis (every 20 ms), into a number of frequency channels and retains only the subset of channels satisfying a signal-to-reverberant ratio (SRR) criterion. Constructing this criterion assumes a priori knowledge of the target (anechoic) signal; the aim of this study was therefore to assess the full potential of the channel-selection algorithm under the assumption that the criterion can be estimated accurately. Listening tests with normal-hearing listeners were conducted to assess the performance of the algorithm in highly reverberant conditions (T60 = 1.0 s), including additive noise at 0 and 5 dB signal-to-noise ratios (SNRs).

RESULTS: A substantial gain in intelligibility was obtained in both the reverberant and the combined reverberation-plus-noise conditions. Mean intelligibility scores improved by 44 and 33 percentage points in the reverberation-plus-noise conditions at 0 and 5 dB SNR, respectively. Feature analysis of the consonant confusion matrices revealed that the transmission of voicing information was most negatively affected, followed by manner and place of articulation.

CONCLUSIONS: The proposed algorithm produced substantial gains in intelligibility, a benefit attributed to the ability of the SRR criterion to accurately detect voiced/unvoiced boundaries. It is postulated that detection of these boundaries is critical for better perception of voicing information and manner of articulation.
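To make the channel-selection step concrete, the following is a minimal sketch of ideal SRR-based selection, assuming (as the study does) a priori access to the anechoic target signal. The STFT-based decomposition, the -5 dB selection threshold, and the function name are illustrative assumptions for this sketch, not the authors' exact implementation.

```python
# Hedged sketch of ideal SRR-based channel selection, NOT the authors'
# implementation. Assumes the reverberant and anechoic signals are
# time-aligned and of equal length; frame length, channel resolution,
# and the -5 dB threshold are illustrative choices.
import numpy as np
from scipy.signal import stft, istft

def ideal_channel_selection(reverberant, anechoic, fs, threshold_db=-5.0):
    """Retain only time-frequency channels whose signal-to-reverberant
    ratio (SRR) exceeds threshold_db; discard (zero out) the rest."""
    nperseg = int(0.020 * fs)                        # 20-ms analysis frames
    f, t, R = stft(reverberant, fs, nperseg=nperseg)  # reverberant channels
    _, _, S = stft(anechoic, fs, nperseg=nperseg)     # a priori target channels

    # SRR per channel: target energy vs. residual (reverberant-only) energy
    target_energy = np.abs(S) ** 2
    residual_energy = np.abs(R - S) ** 2 + 1e-12      # avoid divide-by-zero
    srr_db = 10.0 * np.log10(target_energy / residual_energy + 1e-12)

    # Binary channel selection: keep channels dominated by the target
    mask = (srr_db > threshold_db).astype(float)
    _, processed = istft(R * mask, fs, nperseg=nperseg)
    return processed
```

Because each time-frequency channel is either retained in full or discarded, the processing amounts to applying an ideal binary mask, analogous to the ideal binary masking studied for additive noise.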
