The neural dynamics of auditory word recognition and integration

Listeners recognize and integrate words in rapid and noisy everyday speech by combining expectations about upcoming content with incremental sensory evidence. We present a computational model of word recognition that formalizes this perceptual process in terms of Bayesian decision theory. We fit this model to explain scalp EEG signals recorded as subjects passively listened to a fictional story, revealing both the dynamics of the online auditory word recognition process and the neural correlates of the recognition and integration of words. The model reveals distinct neural processing of words depending on whether or not they can be quickly recognized. While all words trigger a neural response characteristic of probabilistic integration -- voltage modulations predicted by a word's surprisal in context -- these modulations are amplified for words that require more than roughly 100 ms of input to be recognized. We observe no difference in the latency of these neural responses according to words' recognition times. Our results support a two-part model of speech comprehension, combining an eager and rapid process of word recognition with a temporally independent process of word integration.
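
As a rough illustration of what combining contextual expectations with incremental sensory evidence can look like, the sketch below updates a posterior over candidate words one phoneme at a time and reports the first point at which a target word crosses a decision threshold. It is a toy under stated assumptions, not the model fit in this work: the three-word lexicon, the uniform phoneme-confusion noise (`p_correct`, `n_phonemes`), the threshold value, and all function names are hypothetical choices made for the example.

```python
import math


def posterior_trajectory(prior, lexicon, heard, p_correct=0.9, n_phonemes=40):
    """Posterior over candidate words after each heard phoneme.

    Assumed noise model (hypothetical): the phoneme a word predicts is heard
    correctly with probability p_correct; otherwise the listener hears one of
    the remaining n_phonemes - 1 phonemes uniformly at random.
    """
    p_confuse = (1.0 - p_correct) / (n_phonemes - 1)
    # Start from the contextual prior, kept in log space for stability.
    log_post = {word: math.log(p) for word, p in prior.items()}
    trajectory = []
    for i, phoneme in enumerate(heard):
        for word in log_post:
            phones = lexicon[word]
            match = i < len(phones) and phones[i] == phoneme
            log_post[word] += math.log(p_correct if match else p_confuse)
        # Normalise so the posterior sums to one at every step.
        top = max(log_post.values())
        z = sum(math.exp(v - top) for v in log_post.values())
        trajectory.append(
            {word: math.exp(v - top) / z for word, v in log_post.items()}
        )
    return trajectory


def recognition_point(trajectory, target, threshold=0.5):
    """Number of phonemes needed before the target's posterior reaches threshold."""
    for n_heard, posterior in enumerate(trajectory, start=1):
        if posterior[target] >= threshold:
            return n_heard
    return None  # never recognized within the input


def surprisal(prior, word):
    """Surprisal of a word in context, in bits."""
    return -math.log2(prior[word])


# Hypothetical context in which "cat" is expected and "cap" is less expected.
prior = {"cat": 0.6, "cap": 0.3, "dog": 0.1}
lexicon = {
    "cat": ["k", "ae", "t"],
    "cap": ["k", "ae", "p"],
    "dog": ["d", "ao", "g"],
}

early = recognition_point(posterior_trajectory(prior, lexicon, ["k", "ae", "t"]), "cat")
late = recognition_point(posterior_trajectory(prior, lexicon, ["k", "ae", "p"]), "cap")
print(early, late, surprisal(prior, "cat"), surprisal(prior, "cap"))
```

In this toy setting the expected word crosses the threshold after its first phoneme, while the less expected competitor needs its final, disambiguating phoneme. That mirrors the contrast drawn above between words that are recognized quickly and words that require more input, though the mapping from phoneme counts to milliseconds is left out here.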
