Matching pursuit for channel selection in cochlear implants based on an intelligibility metric

In this paper, the previously proposed short-time objective intelligibility (STOI) predictor is simplified so that it can be expressed as a weighted ℓ2 norm in the auditory domain. Because the simplified measure has the mathematical properties of a norm, it can be used with the matching pursuit algorithm for n-of-m channel selection, a technique found in several cochlear implant (CI) coding strategies. With this technique, only a subset of the frequency channels (electrodes) is stimulated, so that important channels can be updated more frequently and less significant channels are omitted. Intelligibility predictions with acoustic CI simulations for normal-hearing listeners indicate that the proposed method yields more intelligible speech than a conventional channel selection method based on peak picking. Two reasons are given for this difference in performance: (1) STOI uses an analysis window of a few hundred milliseconds in order to account for the low temporal modulation frequencies that are important for speech intelligibility, and (2) spectral leakage per channel is accounted for in the mathematical optimization process.
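The idea of greedy n-of-m selection under a weighted ℓ2 norm can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `weighted_matching_pursuit`, the dictionary `D` (one column per channel, describing how that channel spreads across auditory bands), and the weight vector `w` are hypothetical placeholders, and the actual STOI-derived weights and analysis window are not reproduced here. The sketch also assumes that each channel may be selected at most once, which plain matching pursuit does not enforce.

```python
import numpy as np

def weighted_matching_pursuit(x, D, w, n):
    """Greedily pick n columns of D to approximate x under a weighted l2 norm.

    x : target vector in the (assumed) auditory domain, shape (bands,)
    D : hypothetical dictionary, one column per channel, shape (bands, m)
    w : positive weights defining the norm sum(w * e**2), shape (bands,)
    n : number of channels to select (n <= m)
    """
    x = np.asarray(x, dtype=float)
    D = np.asarray(D, dtype=float)
    sw = np.sqrt(np.asarray(w, dtype=float))

    # Scaling by sqrt(w) turns the weighted norm into a plain l2 norm.
    r = sw * x                      # residual in the scaled domain
    A = sw[:, None] * D             # scaled atoms
    norms = np.linalg.norm(A, axis=0)

    m = D.shape[1]
    gains = np.zeros(m)
    available = np.ones(m, dtype=bool)
    selected = []

    for _ in range(n):
        # Normalized correlation of every atom with the current residual.
        corr = A.T @ r / norms
        score = np.abs(corr)
        score[~available] = -np.inf  # assumption: each channel used once
        k = int(np.argmax(score))

        # Optimal gain for the chosen atom, then remove its contribution.
        gains[k] = corr[k] / norms[k]
        r = r - gains[k] * A[:, k]
        available[k] = False
        selected.append(k)

    return selected, gains, float(r @ r)

# Toy example: identity dictionary (no spectral leakage), uniform weights.
sel, g, err = weighted_matching_pursuit([1.0, 5.0, 2.0, 4.0], np.eye(4), np.ones(4), 2)
```

With an identity dictionary and uniform weights, the selection reduces to picking the largest entries, which is what peak picking does. The two approaches differ once the columns of `D` overlap (spectral leakage) or the weights are non-uniform, because the residual update then changes which channel is most useful at the next step.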
