Modulation of the primary auditory thalamus when recognising speech in noise

Recognising speech in background noise is a strenuous daily activity, yet most humans can master it. One mechanistic explanation of how the human brain deals with such sensory uncertainty is the Bayesian Brain Hypothesis. In this view, the brain uses a dynamic generative model to simulate the most likely trajectory of the speech signal. Such a simulation account can explain why sensory pathway structures (i.e., the sensory thalami) show a task-dependent modulation: with identical stimulus input, their responses are stronger for recognition tasks that require tracking fast-varying stimulus properties (i.e., speech) than for tasks that rely on relatively constant stimulus properties (e.g., speaker identity). Here we test the specific hypothesis that this task-dependent modulation for speech recognition increases in parallel with the sensory uncertainty in the speech signal. In accordance with this hypothesis, we show, using ultra-high-resolution functional magnetic resonance imaging in human participants, that the task-dependent modulation of the left primary sensory thalamus (ventral medial geniculate body, vMGB) for speech is particularly strong when recognising speech in noisy listening conditions, in contrast to situations where the speech signal is clear. Exploratory analyses showed that this finding was specific to the left vMGB; it was not present in the midbrain structure of the auditory pathway (left inferior colliculus, IC). The results imply that speech-in-noise recognition is supported by modifications at the level of the subcortical sensory pathway that provides driving input to the auditory cortex.
