Word surprisal predicts N400 amplitude during reading

We investigated the effect of word surprisal on the EEG signal during sentence reading. For each word of 205 experimental sentences, surprisal was estimated by three types of language model: Markov models, probabilistic phrase-structure grammars, and recurrent neural networks. Four event-related potential components were extracted from the EEG of 24 participants who read the same sentences. Under each model type, surprisal estimates were a significant predictor of the amplitude of the N400 component only, with more surprising words eliciting more negative N400s. This effect was mostly due to content words. These findings support surprisal as a generally applicable measure of processing difficulty during language comprehension.
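For readers unfamiliar with the measure, the surprisal of a word is the negative log probability of that word given the words preceding it, so less expected words receive higher values. The following is a minimal sketch of how a first-order Markov (bigram) model could assign surprisal to each word of a sentence. The toy corpus, the add-one smoothing, and the function names `train_bigram` and `word_surprisal` are illustrative assumptions, not the models or tools used in the study.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count contexts and bigrams over whitespace-tokenized sentences."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split()  # "<s>" marks the sentence start
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab

def word_surprisal(sentence, model):
    """Return (word, surprisal in bits) for each word of the sentence.

    Assumption: add-one smoothing, chosen only for simplicity, and all
    words of the sentence occur in the training vocabulary.
    """
    context_counts, bigram_counts, vocab = model
    tokens = ["<s>"] + sentence.split()
    vocab_size = len(vocab)
    result = []
    for prev, cur in zip(tokens, tokens[1:]):
        prob = (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + vocab_size)
        result.append((cur, -math.log2(prob)))  # surprisal = -log2 P(word | context)
    return result

# Hypothetical toy corpus for illustration only.
toy_corpus = ["the dog barked", "the dog slept", "the cat slept"]
model = train_bigram(toy_corpus)
expected = word_surprisal("the cat slept", model)
unexpected = word_surprisal("the cat barked", model)
```

In this toy setting, "barked" never follows "cat" in the training data, so it receives a higher surprisal than "slept" in the same position. In a study like the one described, such per-word values would then be entered as a predictor of ERP amplitude.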
