The potential of onset enhancement for increased speech intelligibility in auditory prostheses.

Recent studies have shown that the transient parts of a speech signal contribute most to speech intelligibility in normal-hearing listeners. This study investigated how enhancing the onsets of the speech envelope affects speech intelligibility in noisy conditions, using an eight-channel cochlear implant vocoder simulation. The enhanced envelope (EE) strategy emphasizes the onsets of the speech envelope by deriving an additional peak signal at the onsets in each frequency band. A sentence recognition task in stationary speech-shaped noise showed a significant speech reception threshold (SRT) improvement of 2.5 dB for the EE strategy relative to the reference continuous interleaved sampling (CIS) strategy, and of 1.7 dB when an ideal Wiener filter was used for the onset extraction on the noisy signal. In a competing-talker condition, a significant SRT improvement of 2.6 dB was measured. With the peak signal derived from the clean speech, a benefit was obtained in all experiments. Although the EE strategy is not effective in many real-life situations, the results suggest that there is potential for speech intelligibility improvement when an enhancement of the onsets of the speech envelope is included in the signal processing of auditory prostheses.
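
The idea of adding a peak signal at envelope onsets can be illustrated with a minimal sketch. It is not the procedure used in the study: it assumes, purely for illustration, that the peak signal is the positive part of the difference between a channel envelope and a slowly varying (one-pole low-pass) copy of it. The function names `smooth`, `enhance_onsets`, and `enhance_channels`, and the parameters `alpha` and `gain`, are hypothetical.

```python
def smooth(envelope, alpha):
    """One-pole low-pass filter giving a slowly varying copy of the envelope.

    alpha close to 1 means heavy smoothing (assumed value, not from the study).
    """
    out = []
    state = envelope[0] if envelope else 0.0
    for x in envelope:
        # Move the state a fraction (1 - alpha) of the way towards the input.
        state = state + (1.0 - alpha) * (x - state)
        out.append(state)
    return out


def enhance_onsets(envelope, alpha=0.9, gain=1.0):
    """Return (enhanced_envelope, peak_signal) for one frequency band.

    The peak signal is non-zero only where the envelope rises above its
    smoothed copy, i.e. around onsets; offsets contribute nothing.
    """
    slow = smooth(envelope, alpha)
    peak = [max(x - s, 0.0) for x, s in zip(envelope, slow)]
    enhanced = [x + gain * p for x, p in zip(envelope, peak)]
    return enhanced, peak


def enhance_channels(envelopes, alpha=0.9, gain=1.0):
    """Apply the onset enhancement independently to each band envelope."""
    return [enhance_onsets(env, alpha, gain)[0] for env in envelopes]


if __name__ == "__main__":
    # Toy envelope: silence, a sudden onset, a steady portion, then an offset.
    env = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
    enhanced, peak = enhance_onsets(env)
    for e, p, out in zip(env, peak, enhanced):
        print(f"{e:.2f}  {p:.3f}  {out:.3f}")
```

In this sketch the peak is largest at the sample where the envelope jumps and decays while the envelope stays constant, so the onset is emphasized and the steady-state portion is left close to its original level. In a vocoder simulation, `enhance_channels` would be applied to the band envelopes before they modulate the carriers.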
