Single-cell activity in human STG during perception of phonemes is organized according to manner of articulation

A long-standing controversy in psycholinguistic research concerns how phonemes are coded in human auditory cortex during speech perception. The motor theory of speech perception holds that phonemes are organized according to the common articulatory gestures that generate them, whereas auditory theories argue that phonetic processing is organized according to common spectro-temporal patterns in phoneme waveforms. Here, we recorded spiking activity in the superior temporal gyrus (STG) of six neurosurgical patients who performed a listening task with phoneme stimuli. Using a Naïve-Bayes model, we show that single-cell responses to phonemes are governed by articulatory features that have acoustic correlates (manner-of-articulation) and are organized according to sonority, with two main clusters for sonorants and obstruents. We further find that 'neural similarity' (i.e., the similarity of evoked spiking activity between a pair of phonemes) is comparable to 'perceptual similarity' (i.e., how similar the two phonemes sound), as estimated from perceptual confusions assessed behaviorally in healthy subjects. Thus, phonemes that were perceptually similar also evoked similar neural responses. Our findings establish that phonemes are encoded according to manner-of-articulation, supporting auditory theories of speech perception, and that the perceptual representation of phonemes can be reflected in the activity of single neurons in STG.
