Uncertainty and Expectation in Sentence Processing: Evidence From Subcategorization Distributions

There is now considerable evidence that human sentence processing is expectation based: As people read a sentence, they use their statistical experience with their language to generate predictions about upcoming syntactic structure. This study examines how sentence processing is affected by readers' uncertainty about those expectations. In a self-paced reading study, we use lexical subcategorization distributions to factorially manipulate both the strength of expectations and the uncertainty about them. We compare two types of uncertainty: uncertainty about the verb's complement, reflecting the next prediction step; and uncertainty about the full sentence, reflecting an unbounded number of prediction steps. We find that uncertainty about the full structure, but not about the next step, is a significant predictor of processing difficulty: Greater reduction in uncertainty is correlated with increased reading times (RTs). We additionally replicate previously observed effects of expectation violation (surprisal), orthogonal to the effect of uncertainty. Together, these results suggest that both surprisal and uncertainty affect human RTs. We discuss the consequences for theories of sentence comprehension.
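
As a rough illustration of how the two quantities can vary independently, the sketch below computes the surprisal of a sentential complement and the single-step entropy over complement frames for two verbs. The verb names, frame labels, and probabilities are hypothetical and are not taken from the study's materials. Only next-step (complement) uncertainty is shown; full-sentence uncertainty would additionally require a probabilistic grammar over continuations.

```python
import math

def surprisal(p):
    """Surprisal in bits of an outcome with probability p."""
    return -math.log2(p)

def entropy(dist):
    """Shannon entropy in bits of a discrete distribution (dict of probabilities)."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Hypothetical subcategorization distributions P(frame | verb).
# NP = noun phrase, SC = sentential complement, PP = prepositional phrase.
verbs = {
    "verb_a": {"SC": 0.5, "NP": 0.5},
    "verb_c": {"SC": 0.5, "NP": 0.25, "PP": 0.25},
}

results = {}
for verb, frames in verbs.items():
    # Same expectation strength for SC, different uncertainty over frames.
    results[verb] = (surprisal(frames["SC"]), entropy(frames))
    print(verb, "surprisal(SC) =", results[verb][0], "entropy =", results[verb][1])
```

Under these assumed distributions, both verbs make a sentential complement equally expected, while the second verb leaves the reader more uncertain about which frame will follow. This is the kind of dissociation a factorial design over subcategorization distributions relies on.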
