On the balance of envelope and temporal fine structure in the encoding of speech in the early auditory system.

How the spectrotemporal modulations of speech (or its spectrogram) are encoded in the responses of the auditory nerve remains widely debated, as does whether speech intelligibility is best conveyed by the "envelope" (E) or the "temporal fine structure" (TFS) of those neural responses. Vocoders are widely used to address this question, on the common assumption that manipulating the amplitude-modulation and frequency-modulation components of the vocoded signal shifts the balance between E and TFS encoding on the nerve, so that the contribution of each to intelligibility can be assessed. Here we argue that this assumption is incorrect and that the vocoder approach does not differentially alter neural E and TFS. Using a simplified model of early auditory processing, we demonstrate that neural E and TFS both encode the speech spectrogram with constant and comparable relative effectiveness regardless of the vocoder manipulations. However, we also show that neural TFS cues are less vulnerable than their E counterparts in severely noisy conditions, and hence should play a more prominent role in cochlear stimulation strategies.
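For readers unfamiliar with the acoustic quantities that vocoders manipulate, the following is a minimal sketch of the conventional Hilbert decomposition of a single band-limited signal into an amplitude-modulation part (envelope) and a frequency-modulation part (fine structure). It is an illustration only: it operates on the acoustic signal, not on neural responses, and it is not the simplified auditory model referred to in the abstract. The function names, the sampling rate, and the test tone are all assumptions chosen for the example.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal of a real sequence, built by zeroing negative frequencies."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep DC once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep Nyquist once
        weights[1:n // 2] = 2.0           # double positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def envelope_and_tfs(band_signal):
    """Split one band-limited signal into Hilbert envelope and fine structure."""
    z = analytic_signal(band_signal)
    envelope = np.abs(z)                  # slowly varying amplitude (AM part)
    tfs = np.cos(np.angle(z))             # unit-amplitude carrier (FM part)
    return envelope, tfs

# Hypothetical test signal: a 1 kHz carrier with 4 Hz amplitude modulation.
fs = 16000                                # assumed sampling rate in Hz
t = np.arange(fs) / fs                    # one second of samples
true_env = 1.0 + 0.5 * np.sin(2 * np.pi * 4 * t)
x = true_env * np.cos(2 * np.pi * 1000 * t)

env, tfs = envelope_and_tfs(x)
```

Multiplying `env` by `tfs` recovers the original band signal. A vocoder in this style typically keeps one of the two components per band and replaces the other, for example with a noise or tone carrier.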
