Representation of speech in human auditory cortex: Is it special?

Successful categorization of phonemes in speech requires that the brain analyze the acoustic signal along both spectral and temporal dimensions. Neural encoding of the stimulus amplitude envelope is critical for parsing the speech stream into syllabic units. Encoding of voice onset time (VOT) and place of articulation (POA), cues necessary for determining phonemic identity, occurs within shorter time frames. An unresolved question is whether the neural representation of speech is based on processing mechanisms that are unique to humans and shaped by learning and experience, or on rules governing general auditory processing that are also present in non-human animals. This question was examined by comparing the neural activity elicited by speech and other complex vocalizations in primary auditory cortex of macaques, which are limited vocal learners, with that in Heschl's gyrus, the putative location of primary auditory cortex in humans. Entrainment to the amplitude envelope is specific neither to humans nor to human speech. VOT is represented by responses time-locked to consonant release and voicing onset in both humans and monkeys. Temporal representation of VOT is observed both for isolated syllables and for syllables embedded in the more naturalistic context of running speech. The fundamental frequency of male speakers is represented by more rapid neural activity phase-locked to the glottal pulsation rate in both humans and monkeys. In both species, the differential representation of stop consonants varying in their POA can be predicted by the relationship between the frequency selectivity of neurons and the onset spectra of the speech sounds. These findings indicate that the neurophysiology of primary auditory cortex is similar in monkeys and humans despite their vastly different experience with human speech, and that Heschl's gyrus is engaged in general auditory, and not language-specific, processing.
This article is part of a Special Issue entitled "Communication Sounds and the Brain: New Directions and Perspectives".
