Distorting temporal fine structure by phase shifting and its effects on speech intelligibility and neural phase locking

Envelope (E) and temporal fine structure (TFS) are important features of acoustic signals, and their perceptual functions have been investigated with various listening tasks. To better understand the neural processing of TFS, we conducted experiments in humans and animals to determine how modifying the TFS of natural speech sentences affects both speech recognition and neural coding. The TFS of natural speech sentences was modified by distorting the phase while maintaining the magnitude. Speech intelligibility was then tested in normal-hearing listeners using the intact and reconstructed sentences, presented in quiet and against background noise. Sentences with modified TFS were also used to evoke neural activity in auditory neurons of the inferior colliculus in guinea pigs. Our study demonstrated that speech intelligibility in humans relied on the periodic cues of speech TFS in both quiet and noisy listening conditions. Furthermore, recordings of neural activity from the guinea pig inferior colliculus showed that individual auditory neurons exhibit phase-locking patterns to the periodic cues of speech TFS, and that these patterns disappear when the reconstructed sounds no longer show periodic patterns. Thus, the periodic cues of TFS are essential for speech intelligibility and are encoded in auditory neurons by phase locking.
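The phase manipulation described above (distorting the phase while maintaining the magnitude) can be illustrated with a minimal sketch. This is not the authors' processing chain: it assumes a single Fourier transform over the whole signal rather than any frame-based or band-wise scheme, and the function name `distort_phase`, the `amount` parameter, and the synthetic harmonic test signal are hypothetical choices made only for illustration.

```python
import numpy as np


def distort_phase(signal, amount=1.0, seed=0):
    """Perturb the phase spectrum of a real signal, keeping its magnitude.

    Hypothetical helper: `amount` scales a uniform random phase offset
    (0.0 leaves the signal unchanged, 1.0 fully randomizes the phase).
    """
    signal = np.asarray(signal, dtype=float)
    rng = np.random.default_rng(seed)

    spectrum = np.fft.rfft(signal)
    magnitude = np.abs(spectrum)
    phase = np.angle(spectrum)

    # Add a random offset in [-pi, pi) to every frequency bin.
    new_phase = phase + amount * rng.uniform(-np.pi, np.pi, size=phase.shape)

    # The DC bin (and the Nyquist bin for even lengths) must stay real,
    # so their original phase is kept to preserve the magnitude exactly.
    new_phase[0] = phase[0]
    if signal.size % 2 == 0:
        new_phase[-1] = phase[-1]

    return np.fft.irfft(magnitude * np.exp(1j * new_phase), n=signal.size)


# Assumed test stimulus: a periodic harmonic complex (100 Hz fundamental),
# standing in for a voiced speech segment.
fs = 16000
t = np.arange(fs) / fs
harmonic = sum(np.sin(2 * np.pi * 100 * k * t) for k in range(1, 6))
distorted = distort_phase(harmonic, amount=1.0)
```

Comparing `np.abs(np.fft.rfft(harmonic))` with `np.abs(np.fft.rfft(distorted))` shows that the magnitude spectrum is unchanged, while the waveform itself differs because the phase relationships between components have been scrambled.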
