Linguistic constraints modulate speech timing in an oscillating neural network

Neuronal oscillations putatively track speech in order to optimize sensory processing. However, it is unclear how isochronous brain oscillations can track pseudo-rhythmic speech input. Here we investigate how top-down predictions flowing from internal language models interact with oscillations during speech processing. We show that word-to-word onset delays are shorter when words are spoken in predictable contexts. A computational model including oscillations, feedback, and inhibition is able to track the natural pseudo-rhythmic word-to-word onset differences. As the model processes speech, it generates temporal phase codes, which are a candidate mechanism for carrying information forward in time in the system. Intriguingly, the model’s response is more rhythmic for non-isochronous than for isochronous speech when onset times are proportional to predictions from the internal model. These results show that oscillatory tracking of temporal speech dynamics relies not only on the input acoustics, but also on the linguistic constraints flowing from knowledge of language.
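The idea of a temporal phase code can be illustrated with a toy sketch. This is not the authors' model: it omits inhibition and any sensory input, and every name and parameter value here (`first_crossing_phase`, `feedback_gain`, the 4 Hz frequency, the threshold) is a hypothetical choice made for illustration. It only assumes that a word node's activation is the sum of an oscillating excitability term and a feedback term that grows with the word's predictability, so that more predictable words reach threshold at an earlier oscillation phase.

```python
import math


def first_crossing_phase(predictability, freq_hz=4.0, osc_amp=1.0,
                         feedback_gain=0.8, threshold=1.0, dt=1e-4):
    """Phase (radians) at which a toy word node first reaches threshold.

    All parameters are illustrative assumptions, not values from the paper.
    Returns None if the node never reaches threshold within one cycle.
    """
    period = 1.0 / freq_hz
    steps = int(round(period / dt))
    for i in range(steps):
        t = i * dt
        phase = 2.0 * math.pi * freq_hz * t
        # Excitability is 0 at phase 0 and peaks at osc_amp at phase pi.
        excitability = osc_amp * (1.0 - math.cos(phase)) / 2.0
        # Top-down feedback adds a constant boost scaled by predictability.
        activation = excitability + feedback_gain * predictability
        if activation >= threshold:
            return phase
    return None


def phase_to_delay(phase, freq_hz=4.0):
    """Convert a phase within the cycle to a time delay in seconds."""
    return phase / (2.0 * math.pi * freq_hz)


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        phase = first_crossing_phase(p)
        print(f"predictability={p:.1f}  phase={phase:.3f} rad  "
              f"delay={phase_to_delay(phase) * 1000:.1f} ms")
```

In this sketch the phase of the threshold crossing carries information about predictability, and the corresponding delay is shorter for more predictable words, which mirrors the direction of the effect reported in the abstract without reproducing its magnitude.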
