Contribution of frequency modulation to speech recognition in noise.

Cochlear implants allow most patients with profound deafness to communicate successfully under optimal listening conditions. However, the amplitude modulation (AM) information provided by most implants is not sufficient for speech recognition in realistic settings, where noise is typically present. This study added slowly varying frequency modulation (FM) to the existing algorithm of an implant simulation and used competing sentences to evaluate the contribution of FM to speech recognition in noise. The potential FM advantage was evaluated as a function of the number of spectral bands, FM depth, FM rate, and FM band distribution. Barring floor and ceiling effects, the additional FM cue significantly improved performance for every number of bands tested, from 1 to 32, both in quiet and in noise. Performance also improved with greater FM depth and rate, which might reflect resolved sidebands under the FM condition. Having FM present in low-frequency bands was more beneficial than having it in high-frequency bands, and FM was needed in only half of the bands, regardless of their position, to achieve performance similar to that obtained when all bands carried the FM cue. These results provide insight into the relative contributions of AM and FM to speech communication and into the potential advantage of incorporating FM in cochlear implant signal processing.
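
As a rough illustration of what adding a slowly varying FM cue to an AM-only band can mean, the sketch below synthesizes one band twice: once with a fixed-frequency carrier (AM only) and once with a carrier whose instantaneous frequency follows a slow FM track (AM plus FM). It is not the study's processing algorithm. The function names, the sinusoidal FM track, and all numeric values (sample rate, center frequency, FM depth, FM rate, envelope rate) are assumptions chosen only for illustration.

```python
import math

def synthesize_band(am, fm, fc, fs):
    """Return am[n] * cos(phase[n]) for one band.

    The phase accumulates the instantaneous frequency fc + fm[n] (in Hz)
    at sample rate fs (in Hz). With fm all zeros this is an AM-only band.
    """
    out = []
    phase = 0.0
    for a, f in zip(am, fm):
        out.append(a * math.cos(phase))
        phase += 2.0 * math.pi * (fc + f) / fs
    return out

def sinusoidal_fm(n, depth_hz, rate_hz, fs):
    """Hypothetical FM track: a sinusoid with the given depth and rate."""
    return [depth_hz * math.sin(2.0 * math.pi * rate_hz * i / fs)
            for i in range(n)]

# Assumed illustration values, not taken from the study.
fs = 16000                      # sample rate in Hz
n = fs // 10                    # 100 ms of signal
am = [0.5 * (1.0 - math.cos(2.0 * math.pi * 4.0 * i / fs))
      for i in range(n)]        # slow 4 Hz amplitude envelope
fm = sinusoidal_fm(n, depth_hz=50.0, rate_hz=20.0, fs=fs)

am_only = synthesize_band(am, [0.0] * n, fc=1000.0, fs=fs)
am_fm = synthesize_band(am, fm, fc=1000.0, fs=fs)
```

In this sketch, FM depth and FM rate are the two arguments of `sinusoidal_fm`, so varying them corresponds loosely to the depth and rate manipulations described above. A multi-band simulation would sum several such bands, with the FM track set to zero in the bands that do not carry the FM cue.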
