Dynamic and Task-Dependent Encoding of Speech and Voice by Phase Reorganization of Cortical Oscillations

Speech and vocal sounds are at the core of human communication. Cortical processing of these sounds critically depends on behavioral demands. However, the neurocomputational mechanisms enabling this adaptive processing remain elusive. Here we examine the task-dependent reorganization of electroencephalographic responses to natural speech sounds (vowels /a/, /i/, /u/) spoken by three speakers (two female, one male) while listeners perform a one-back task on either vowel or speaker identity. We show that dynamic changes of sound-evoked responses and phase patterns of cortical oscillations in the alpha band (8–12 Hz) closely reflect the abstraction and analysis of the sounds along the task-relevant dimension. Vowel categorization leads to a significant temporal realignment of responses to the same vowel, e.g., /a/, independent of who pronounced this vowel, whereas speaker categorization leads to a significant temporal realignment of responses to the same speaker, e.g., speaker 1, independent of which vowel she/he pronounced. This transient and goal-dependent realignment of neuronal responses to physically different external events provides a robust cortical coding mechanism for forming and processing abstract representations of auditory (speech) input.
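The idea of temporal realignment can be illustrated with a measure of phase consistency across responses. The following is a minimal sketch, not the authors' analysis pipeline: it assumes a single 10 Hz component stands in for the alpha band and uses a Hann-tapered single-frequency Fourier projection to estimate phase. The function names (`alpha_phase`, `phase_coherence`, `make_trial`), the sampling rate, and the synthetic trials are all hypothetical and chosen only for illustration.

```python
import cmath
import math


def alpha_phase(signal, fs, freq=10.0):
    """Estimate the phase (radians) of the `freq` Hz component of one trial.

    Uses a Hann-tapered projection onto a complex exponential; this is an
    illustrative stand-in for a proper band-pass or wavelet analysis.
    """
    n = len(signal)
    acc = 0j
    for k, x in enumerate(signal):
        taper = 0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1))
        acc += taper * x * cmath.exp(-2j * math.pi * freq * k / fs)
    return cmath.phase(acc)


def phase_coherence(phases):
    """Length of the mean unit phase vector across trials.

    Values near 1 mean the trials share a common phase (aligned responses);
    values near 0 mean the phases are scattered.
    """
    mean_vec = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vec)


def make_trial(phase_offset, fs=250, dur=0.5, freq=10.0):
    """Synthetic trial: a pure cosine at `freq` Hz with a given phase offset."""
    n = int(fs * dur)
    return [math.cos(2 * math.pi * freq * k / fs + phase_offset) for k in range(n)]


fs = 250
# Hypothetical "task-relevant" case: responses to three different sounds share a phase.
aligned = [alpha_phase(make_trial(0.3), fs) for _ in range(3)]
# Hypothetical "task-irrelevant" case: the same three responses have scattered phases.
scattered = [alpha_phase(make_trial(p), fs) for p in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)]

coh_aligned = phase_coherence(aligned)
coh_scattered = phase_coherence(scattered)
```

In this toy setting, `coh_aligned` is close to 1 and `coh_scattered` is close to 0, which mirrors the abstract's contrast between responses grouped along the task-relevant dimension and those that are not. Real EEG would need band-pass filtering over 8–12 Hz, artifact handling, and statistics across trials and participants.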
