Changes in Neuronal Representations of Consonants in the Ascending Auditory System and Their Role in Speech Recognition

A fundamental task of the ascending auditory system is to produce representations that facilitate the recognition of complex sounds. This is particularly challenging in the context of acoustic variability, such as that between different talkers producing the same phoneme. These representations are transformed as information propagates through the ascending auditory system, from the inner ear to the auditory cortex. Investigating these transformations and their role in speech recognition is key to understanding hearing impairment and to developing future clinical interventions. Here, we obtained neural responses to an extensive set of natural vowel-consonant-vowel phoneme sequences, each produced by multiple talkers, at three stages of the auditory processing pathway. Auditory nerve (AN) representations were simulated using a model of the peripheral auditory system, and extracellular neuronal activity was recorded in the inferior colliculus (IC) and primary auditory cortex (AI) of anaesthetized guinea pigs. A classifier was developed to examine the efficacy of these representations for recognizing the speech sounds. Individual neurons convey progressively less information from AN to AI. Nonetheless, at the population level, representations are sufficiently rich to facilitate recognition of consonants with a high degree of accuracy at all stages, indicating a progression from a dense, redundant representation to a sparse, distributed one. We examined the timescale of the neural code for consonant recognition and found that optimal timescales increase throughout the ascending auditory system, from a few milliseconds in the periphery to several tens of milliseconds in the cortex. Despite these longer timescales, we found little evidence to suggest that representations up to the level of AI become increasingly invariant to across-talker differences. Instead, our results support the idea that the role of the subcortical auditory system is one of dimensionality expansion, which could provide a basis for flexible classification of arbitrary speech sounds.
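To make the notion of a classification timescale concrete, the following is a minimal sketch, not the classifier used in this study (the abstract does not specify it). It assumes a generic leave-one-out nearest-template classifier applied to spike trains binned at a chosen width, and it runs on synthetic toy data. All function names, parameters, and spike times are hypothetical and chosen only for illustration.

```python
import numpy as np

def bin_spikes(spike_times_ms, duration_ms, bin_ms):
    """Convert one trial's spike times into a spike-count vector (bin width = bin_ms)."""
    n_bins = int(np.ceil(duration_ms / bin_ms))
    edges = np.arange(n_bins + 1) * bin_ms
    counts, _ = np.histogram(spike_times_ms, bins=edges)
    return counts.astype(float)

def loo_template_accuracy(trials, labels, duration_ms, bin_ms):
    """Leave-one-out nearest-template accuracy (Euclidean distance to class means)."""
    X = np.array([bin_spikes(t, duration_ms, bin_ms) for t in trials])
    y = np.asarray(labels)
    classes = np.unique(y)
    correct = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i  # hold out trial i when building templates
        templates = np.array([X[keep & (y == c)].mean(axis=0) for c in classes])
        dists = np.linalg.norm(templates - X[i], axis=1)
        correct += int(classes[np.argmin(dists)] == y[i])
    return correct / len(y)

def make_toy_trials(n_per_class=20, seed=0):
    """Synthetic data (assumption): two 'stimuli' with equal spike counts but different timing."""
    rng = np.random.default_rng(seed)
    base = {0: np.array([20.0, 25.0, 30.0]), 1: np.array([60.0, 65.0, 70.0])}
    trials, labels = [], []
    for label, times in base.items():
        for _ in range(n_per_class):
            trials.append(times + rng.normal(0.0, 1.0, size=times.size))  # 1 ms jitter
            labels.append(label)
    return trials, labels

trials, labels = make_toy_trials()
fine_acc = loo_template_accuracy(trials, labels, duration_ms=100.0, bin_ms=10.0)
coarse_acc = loo_template_accuracy(trials, labels, duration_ms=100.0, bin_ms=100.0)
print(f"10 ms bins: {fine_acc:.2f}, 100 ms bin (spike count only): {coarse_acc:.2f}")
```

In this toy case the two classes differ only in spike timing, so a single 100 ms bin (a pure spike-count code) cannot separate them, whereas 10 ms bins can. Sweeping `bin_ms` over a range of values and picking the width with the highest accuracy is one simple way to estimate an optimal timescale for a given set of responses.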
