Speech-cue transmission by an algorithm to increase consonant recognition in noise for hearing-impaired listeners.

Consonant recognition was assessed following extraction of speech from noise using a more efficient version of the speech-segregation algorithm described in Healy, Yoho, Wang, and Wang [(2013) J. Acoust. Soc. Am. 134, 3029-3038]. Algorithm processing produced substantial increases in recognition, and these increases were significantly larger for hearing-impaired (HI) than for normal-hearing (NH) listeners in both speech-shaped noise and babble backgrounds. As observed previously for sentence recognition, older HI listeners having access to the algorithm performed as well as or better than young NH listeners in conditions of identical noise. It was also found that the binary masks estimated by the algorithm transmitted speech features to listeners in a fashion highly similar to that of the ideal binary mask (IBM), suggesting that the algorithm is estimating the IBM with substantial accuracy. Further, the speech features associated with voicing, manner of articulation, and place of articulation were all transmitted with relative uniformity and at relatively high levels, indicating that the algorithm and the IBM transmit speech cues without obvious deficiency. Because the current implementation of the algorithm is much more efficient than the earlier version, it should be more amenable to real-time implementation in devices such as hearing aids and cochlear implants.
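For readers unfamiliar with the IBM, the following is a minimal sketch of how such a mask is commonly defined: a time-frequency unit is kept (1) when its local speech-to-noise ratio exceeds a local criterion, and discarded (0) otherwise. It is an illustration only, not the algorithm evaluated here, which estimates the mask from the noisy mixture without access to the separate speech and noise signals. The function names, the toy energy values, and the 0 dB default criterion (`lc_db`) are assumptions made for the example.

```python
import math


def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Return a 0/1 mask over time-frequency units.

    speech_energy, noise_energy: lists of rows (time frames), each row a
    list of per-channel energies for the premixed speech and noise.
    lc_db: local criterion in dB (hypothetical default, chosen for illustration).
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            if s <= 0.0:
                keep = False            # no speech energy in this unit
            elif n <= 0.0:
                keep = True             # speech present, no noise
            else:
                local_snr_db = 10.0 * math.log10(s / n)
                keep = local_snr_db > lc_db
            row.append(1 if keep else 0)
        mask.append(row)
    return mask


def apply_mask(mixture_energy, mask):
    """Zero out the mixture's time-frequency units where the mask is 0."""
    return [[m * e for m, e in zip(m_row, e_row)]
            for m_row, e_row in zip(mask, mixture_energy)]


# Toy 2-frame x 2-channel example (made-up energies).
speech = [[4.0, 0.5], [1.0, 9.0]]
noise = [[1.0, 2.0], [1.0, 3.0]]
mask = ideal_binary_mask(speech, noise)
print(mask)  # [[1, 0], [0, 1]]
```

Lowering `lc_db` retains more units, including some dominated by noise; raising it retains fewer. An estimated mask can be compared against this ideal mask unit by unit to gauge estimation accuracy.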
