Relative contributions of spectral and temporal cues for phoneme recognition.

Cochlear implants provide users with limited spectral and temporal information. In this study, the amount of spectral and temporal information was systematically varied through simulations of cochlear implant processors using a noise-excited vocoder. Spectral information was controlled by varying the number of channels between 1 and 16, and temporal information was controlled by varying the lowpass cutoff frequencies of the envelope extractors from 1 to 512 Hz. Consonants and vowels processed under those conditions were presented to seven normal-hearing native-English-speaking listeners for identification. The results demonstrated that both spectral and temporal cues were important for consonant and vowel recognition, with the spectral cues having a greater effect than the temporal cues over the ranges of channel numbers and lowpass cutoff frequencies tested. Performance reached asymptote at a lowpass cutoff of 16 Hz for consonant recognition and 4 Hz for vowel recognition. Performance plateaued at 8 channels for consonants and 12 channels for vowels. Within the above-mentioned ranges of lowpass cutoff frequency and number of channels, the temporal and spectral cues showed a tradeoff for phoneme recognition. Information transfer analyses showed different relative contributions of spectral and temporal cues in the perception of various phonetic/acoustic features.
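The abstract does not say how the information transfer analyses were computed. As a rough illustration only, the sketch below assumes the common approach of pooling a phoneme confusion matrix by phonetic feature and reporting relative transmitted information (mutual information between stimulus and response categories divided by stimulus entropy). The function name, the feature tables, and the confusion counts are all hypothetical and are not data from the study.

```python
from collections import defaultdict
from math import log2


def relative_information_transfer(confusions, feature_of):
    """Relative transmitted information for one phonetic feature.

    confusions: dict mapping (stimulus, response) phoneme pairs to counts.
    feature_of: dict mapping each phoneme to its category for the feature.
    Returns a value between 0 (no information) and 1 (perfect transfer).
    """
    # Pool the phoneme-level confusions into feature-level confusions.
    pooled = defaultdict(float)
    for (stim, resp), count in confusions.items():
        pooled[(feature_of[stim], feature_of[resp])] += count

    total = sum(pooled.values())
    row = defaultdict(float)  # stimulus category totals
    col = defaultdict(float)  # response category totals
    for (s, r), c in pooled.items():
        row[s] += c
        col[r] += c

    # Mutual information between stimulus and response categories, in bits.
    transmitted = sum(
        (c / total) * log2(c * total / (row[s] * col[r]))
        for (s, r), c in pooled.items()
        if c > 0
    )
    # Entropy of the stimulus categories, in bits.
    stimulus_entropy = -sum(
        (c / total) * log2(c / total) for c in row.values() if c > 0
    )
    return transmitted / stimulus_entropy if stimulus_entropy > 0 else 0.0


# Hypothetical counts: voicing is always heard correctly, place is at chance.
confusions = {
    ("b", "b"): 5, ("b", "d"): 5, ("d", "b"): 5, ("d", "d"): 5,
    ("p", "p"): 5, ("p", "t"): 5, ("t", "p"): 5, ("t", "t"): 5,
}
voicing = {"b": "voiced", "d": "voiced", "p": "voiceless", "t": "voiceless"}
place = {"b": "labial", "p": "labial", "d": "alveolar", "t": "alveolar"}

voicing_transfer = relative_information_transfer(confusions, voicing)
place_transfer = relative_information_transfer(confusions, place)
```

With these made-up counts, the voicing feature is transmitted in full while the place feature carries no information. Computing such a score per feature at each combination of channel number and lowpass cutoff is one way to compare how much each feature depends on spectral versus temporal cues.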
