Neural responses to uninterrupted natural speech can be extracted with precise temporal resolution

The human auditory system has evolved to process individual streams of speech efficiently. However, obtaining temporally detailed responses to distinct, continuous natural speech streams has hitherto been impracticable with standard neurophysiological techniques. Here we describe a method for estimating a temporally precise electrophysiological response to uninterrupted natural speech. We term this response the AESPA (Auditory Evoked Spread Spectrum Analysis); it represents an estimate of the impulse response of the auditory system. It is obtained by assuming that the recorded electrophysiological signal represents a convolution of the amplitude envelope of a continuous speech stream with the to-be-estimated impulse response. We present examples of these responses from both scalp and intracranially recorded human EEG, obtained while subjects listened to a binaurally presented recording of a male speaker reading naturally from a classic work of fiction. This method expands the range of stimulus types that can be used effectively to derive auditory evoked responses and allows the use of considerably more ecologically valid stimulation parameters. Some implications for future research are discussed.
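The convolution assumption stated above can be illustrated with a minimal sketch. The abstract does not specify the estimator, so the ridge-regularized least-squares fit used here is an assumption, as are the function names, the sampling rate, the number of lags, and the synthetic "envelope" and "EEG" signals, which stand in for real recordings.

```python
import numpy as np

def lagged_design(envelope, n_lags):
    """Build a matrix whose column k is the envelope delayed by k samples."""
    n = len(envelope)
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = envelope[:n - k]  # zero-padded at the start
    return X

def estimate_impulse_response(envelope, eeg, n_lags, ridge=1e-3):
    """Fit eeg ~ envelope * w by ridge-regularized least squares (assumed estimator)."""
    X = lagged_design(envelope, n_lags)
    XtX = X.T @ X
    lam = ridge * np.trace(XtX) / n_lags  # ridge scaled to the data's power
    return np.linalg.solve(XtX + lam * np.eye(n_lags), X.T @ eeg)

# Synthetic stand-ins (hypothetical values, not from the paper).
rng = np.random.default_rng(0)
fs = 128                      # assumed sampling rate in Hz
n = fs * 60                   # one minute of data
envelope = np.abs(rng.standard_normal(n))
envelope -= envelope.mean()

# A made-up damped oscillation plays the role of the true impulse response.
t = np.arange(32) / fs
true_w = np.sin(2 * np.pi * 8 * t) * np.exp(-t / 0.05)

# Simulated recording: envelope convolved with the response, plus noise.
eeg = np.convolve(envelope, true_w)[:n] + 0.5 * rng.standard_normal(n)

w_hat = estimate_impulse_response(envelope, eeg, n_lags=32)
corr = np.corrcoef(true_w, w_hat)[0, 1]
```

Here `corr` measures how closely the recovered response `w_hat` matches the simulated one. With real speech the envelope is strongly autocorrelated, which is why some form of regularization, or an equivalent normalization of the cross-correlation, is likely to be needed.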
