Cortical activity patterns predict robust speech discrimination ability in noise

The neural mechanisms that support speech discrimination in noisy conditions are poorly understood. In quiet conditions, spike timing information appears to be used in the discrimination of speech sounds. In this study, we evaluated the hypothesis that spike timing is also used to distinguish between speech sounds in noisy conditions that significantly degrade neural responses to speech sounds. We tested speech sound discrimination in rats and recorded primary auditory cortex (A1) responses to speech sounds in background noise of different intensities and spectral compositions. Our behavioral results indicate that rats, like humans, are able to accurately discriminate consonant sounds even in the presence of background noise that is as loud as the speech signal. Our neural recordings confirm that speech sounds evoke degraded but detectable responses in noise. Finally, we developed a novel neural classifier that mimics behavioral discrimination. The classifier discriminates between speech sounds by comparing the A1 spatiotemporal activity patterns evoked on single trials with the average spatiotemporal patterns evoked by known sounds. Unlike classifiers in most previous studies, this classifier is not provided with the stimulus onset time. Neural activity analyzed with the use of relative spike timing was well correlated with behavioral speech discrimination in quiet and in noise. Spike timing information integrated over longer intervals was required to accurately predict rat behavioral speech discrimination in noisy conditions. The similarity of neural and behavioral discrimination of speech in noise suggests that humans and rats may employ similar brain mechanisms to solve this problem.
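The classifier described above can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean distance metric, the exhaustive sliding search over candidate onsets, the array shapes, and the names `make_template` and `classify_trial` are all assumptions chosen for illustration. The sketch only shows the general idea of matching a single-trial spatiotemporal pattern against average templates when the stimulus onset time is not given.

```python
import numpy as np

def make_template(trials):
    """Average spatiotemporal pattern for one known sound.

    trials: array of shape (n_trials, n_sites, n_bins), onset-aligned.
    """
    return np.mean(trials, axis=0)

def classify_trial(trial, templates):
    """Nearest-template classification without a known stimulus onset.

    trial: array (n_sites, n_bins_recorded), longer than any template.
    templates: dict mapping sound label -> array (n_sites, n_bins_template).
    Returns the best-matching label and the onset bin where it matched.
    The Euclidean distance used here is an assumed metric.
    """
    best_label, best_onset, best_dist = None, None, np.inf
    for label, template in templates.items():
        width = template.shape[1]
        # Slide the template across every candidate onset in the recording.
        for onset in range(trial.shape[1] - width + 1):
            window = trial[:, onset:onset + width]
            dist = np.linalg.norm(window - template)
            if dist < best_dist:
                best_label, best_onset, best_dist = label, onset, dist
    return best_label, best_onset

# Toy data: two recording sites, two sounds that differ only in the
# relative timing of activity across sites (hypothetical values).
dad = np.zeros((2, 5)); dad[0, 0] = 3; dad[1, 2] = 2
tad = np.zeros((2, 5)); tad[0, 3] = 3; tad[1, 1] = 2
templates = {"dad": make_template(dad[None]), "tad": make_template(tad[None])}

trial = np.zeros((2, 20))
trial[:, 7:12] = dad  # the "dad" pattern occurs at an onset the classifier is not told

label, onset = classify_trial(trial, templates)
print(label, onset)  # dad 7
```

Because the two toy templates differ only in which site fires first, the match depends on relative spike timing across sites rather than on absolute time, which is why the search can recover both the sound and its onset.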
