A Model of Online Temporal-Spatial Integration for Immediacy and Overrule in Discourse Comprehension

During discourse comprehension, information from prior processing is integrated and appears to be immediately accessible. A striking demonstration is that, in response to “The peanut was salted/in love,” an N400 is elicited by “salted” and not by “in love” when the prior discourse features the peanut as an animate agent. This effect, in which discourse context overrides lexical and world knowledge, is referred to as discourse overrule. Immediate discourse overrule requires a model that integrates information at two timescales. The first is over the lifetime and includes event knowledge and word semantics. The second is over the discourse in an event context. We propose a model in which both are accounted for by temporal-to-spatial integration of experience into distributed spatial representations, providing immediate access to experience accumulated over different timescales. For lexical semantics, this is modeled by a word embedding system trained by sequential exposure to the entire Wikipedia corpus. For discourse, this is modeled by a recurrent reservoir network trained to generate a discourse vector for input sequences of words. The N400 is modeled as the difference between the instantaneous discourse vector and the target word. We predict that this model can account for semantic immediacy and discourse overrule. The model simulates lexical priming and discourse overrule in the “Peanut in love” discourse. It also demonstrates that an unexpected word elicits a reduced N400 if it is generally related to the event described in the prior discourse, and that this effect disappears when the discourse context is removed. This neurocomputational model is the first to simulate immediacy and overrule in the discourse-modulated N400, and it contributes to the characterization of online integration processes in discourse.
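The architecture described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vocabulary, the random embeddings (stand-ins for Wikipedia-trained word vectors), the reservoir size, leak rate, and spectral radius, the ridge-regression readout, the choice of the mean word embedding as the training target for the discourse vector, and the use of cosine distance as the "difference" between discourse vector and target word are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary with random embeddings; these stand in for
# Wikipedia-trained word vectors and carry no real semantics.
DIM = 16
vocab = ["peanut", "dance", "sing", "love", "salted", "happy"]
emb = {w: rng.standard_normal(DIM) for w in vocab}

# Fixed random reservoir (echo-state style). All constants are assumptions.
N_RES, LEAK, SPECTRAL_RADIUS = 100, 0.3, 0.9
W_in = rng.uniform(-1.0, 1.0, (N_RES, DIM))
W = rng.standard_normal((N_RES, N_RES))
W *= SPECTRAL_RADIUS / np.max(np.abs(np.linalg.eigvals(W)))


def run_reservoir(words):
    """Return the reservoir state after each word of the sequence."""
    x = np.zeros(N_RES)
    states = []
    for w in words:
        # Leaky integration folds the word sequence into a spatial state.
        x = (1 - LEAK) * x + LEAK * np.tanh(W_in @ emb[w] + W @ x)
        states.append(x.copy())
    return np.array(states)


def train_readout(sequences, ridge=1e-2):
    """Ridge-regress reservoir states onto a target discourse vector.

    Assumption: the target at every time step is the mean embedding of
    the words in the full sequence.
    """
    X = np.vstack([run_reservoir(s) for s in sequences])
    Y = np.vstack([
        np.tile(np.mean([emb[w] for w in s], axis=0), (len(s), 1))
        for s in sequences
    ])
    return np.linalg.solve(X.T @ X + ridge * np.eye(N_RES), X.T @ Y)


def n400(discourse_vec, word):
    """N400 proxy: cosine distance between discourse vector and target word."""
    t = emb[word]
    cos = discourse_vec @ t / (np.linalg.norm(discourse_vec) * np.linalg.norm(t))
    return 1.0 - cos


# Toy training discourses (hypothetical), then a readout of the discourse vector.
training = [
    ["peanut", "dance", "sing", "love"],
    ["peanut", "salted"],
    ["happy", "sing", "dance"],
]
W_out = train_readout(training)

context = ["peanut", "dance", "sing", "happy"]
d = run_reservoir(context)[-1] @ W_out  # instantaneous discourse vector

for target in ("love", "salted"):
    print(target, round(float(n400(d, target)), 3))
```

Because the embeddings here are random, the printed values only show how the quantities are computed and say nothing about the peanut effect itself. With real word vectors and a readout trained on actual discourses, a lower value for a target word would correspond to a smaller simulated N400.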
