Interaction between speech variations and background noise on speech intelligibility by Mandarin-speaking cochlear implant patients

Abstract Cochlear implant (CI) users have been shown to be more susceptible to the variations in speech production encountered in everyday listening, in which speaking rate, amplitude, duration, and voice pitch information may be quite variable, depending on the production context. Such variations may be further exacerbated by background noise, especially dynamic noise. The limited spectral resolution provided by the CI limits perception of voice pitch, which is an important cue for speech prosody and for tonal languages such as Mandarin Chinese. In this study, the effect of varying speaking rates and styles and background noise on speech understanding was investigated in Mandarin-speaking CI and normal-hearing (NH) listeners. Thirteen Mandarin-speaking, post-lingually deafened adult CI patients (5 male and 8 female, age 19–62 years) using their clinical processors and 9 NH subjects (5 male and 4 female, age 23–59 years) listening to unprocessed speech participated. Five different types of speech variations, including 3 speaking rates (slow, normal, fast) and 2 speaking styles (emotional, shouted), were presented with one of two masking noises: speech-shaped steady-state noise (SSN) or six-talker babble. The speech reception threshold (SRT), defined as the signal-to-noise ratio producing 50% correct word-in-sentence recognition, was measured using Mandarin Speech Perception materials. NH listeners performed significantly better (by 16.7 dB) than CI patients across all conditions, regardless of speech variations and noise types. CI patients’ performance deficit was highly dependent on speech rate and noise type; the deficit was smallest (11.7 dB) when slowly-spoken speech was presented in SSN and largest (20.6 dB) when shouted speech was presented in six-talker speech babble. NH listeners performed significantly better in speech babble than in SSN for all speech variations, while CI patients performed similarly in both noise types.
The use of clear and slowly-spoken speech in the laboratory setting may largely underestimate CI patients’ performance deficits in real-world listening conditions, where acoustic variations introduced by speech variations and dynamic noise may present additional challenges.
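
To make the threshold definition concrete, the sketch below estimates the signal-to-noise ratio at 50% correct by linear interpolation between measured points, then computes a CI-minus-NH deficit in dB. It is only an illustration: the abstract does not describe how thresholds were actually tracked, the function name `srt_from_scores` is hypothetical, and the scores are invented for the example rather than taken from the study.

```python
def srt_from_scores(snrs_db, percent_correct, target=50.0):
    """Linearly interpolate the SNR (dB) at which recognition reaches `target` percent.

    Assumes percent correct rises with SNR around the crossing point.
    """
    points = sorted(zip(snrs_db, percent_correct))
    for (snr_lo, pc_lo), (snr_hi, pc_hi) in zip(points, points[1:]):
        if pc_lo <= target <= pc_hi and pc_hi != pc_lo:
            frac = (target - pc_lo) / (pc_hi - pc_lo)
            return snr_lo + frac * (snr_hi - snr_lo)
    raise ValueError("scores never cross the target percent correct")


# Invented scores for one listening condition (NOT data from the study).
nh_srt = srt_from_scores([-12, -8, -4, 0], [10, 40, 80, 100])
ci_srt = srt_from_scores([0, 4, 8, 12], [20, 35, 65, 90])

# A lower (more negative) SRT is better, so the CI deficit is CI minus NH.
deficit_db = ci_srt - nh_srt
print(f"NH SRT = {nh_srt:.1f} dB, CI SRT = {ci_srt:.1f} dB, deficit = {deficit_db:.1f} dB")
```

Because a lower threshold means better performance, a positive deficit indicates that the CI listener needs a more favourable signal-to-noise ratio to reach the same 50% score.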