Vowel recognition via cochlear implants and noise vocoders: effects of formant movement and duration.

Previous work has demonstrated that normal-hearing individuals use fine-grained phonetic variation, such as formant movement and duration, when recognizing English vowels. The present study investigated whether these cues are used by postlingually deafened adult cochlear implant users and by normal-hearing individuals listening to noise-vocoder simulations of cochlear implant processing. In Experiment 1, subjects gave forced-choice identification judgments for recordings of vowels that had been signal-processed to remove formant movement and/or equate vowel duration. In Experiment 2, a goodness-optimization procedure was used to create perceptual vowel space maps (i.e., best exemplars within a vowel quadrilateral) that included F1, F2, formant movement, and duration. The results demonstrated that both cochlear implant users and normal-hearing individuals use formant movement and duration cues when recognizing English vowels. Moreover, both listener groups used these cues to the same extent, suggesting that postlingually deafened cochlear implant users have category representations for vowels that are similar to those of normal-hearing individuals.
