Modeling Human Sentence Processing with Left-Corner Recurrent Neural Network Grammars

In computational linguistics, it has been shown that hierarchical structures make language models (LMs) more human-like. However, the previous literature has been agnostic about the parsing strategies of such hierarchical models. In this paper, we investigated whether hierarchical structures make LMs more human-like, and if so, which parsing strategy is most cognitively plausible. To address this question, we evaluated three LMs against human reading times in Japanese, a language with head-final, left-branching structures: a Long Short-Term Memory (LSTM) network as a sequential model, and Recurrent Neural Network Grammars (RNNGs) with top-down and left-corner parsing strategies as hierarchical models. Our computational modeling demonstrated that left-corner RNNGs outperformed both top-down RNNGs and the LSTM, suggesting that hierarchical, left-corner architectures are more cognitively plausible than top-down or sequential architectures. We also discuss the relationships between cognitive plausibility and (i) perplexity, (ii) parsing, and (iii) beam size.
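A common way to carry out this kind of evaluation is surprisal-based regression: each LM assigns a per-word surprisal, and models are compared by how much predictive power surprisal adds to a regression of reading times on baseline predictors. The sketch below is a minimal illustration of that idea, not the authors' exact pipeline; the per-word log-probabilities are assumed to be supplied as a plain array by whatever LM is being evaluated, and the baseline predictors (e.g., word length, log frequency) are placeholders.

```python
# Minimal sketch of surprisal-based evaluation of an LM against reading times.
# Assumption: per-word log-probabilities (natural log) come from some LM;
# the LM interface itself is not shown here.

import numpy as np
import statsmodels.api as sm


def surprisal(log_probs):
    """Convert per-word natural-log probabilities to surprisal in bits."""
    return -np.asarray(log_probs) / np.log(2.0)


def delta_loglik(reading_times, surprisals, baseline_predictors):
    """Log-likelihood gained when surprisal is added to a baseline
    regression of reading times (e.g., on word length and frequency)."""
    X_base = sm.add_constant(np.asarray(baseline_predictors))
    X_full = np.column_stack([X_base, np.asarray(surprisals)])
    base = sm.OLS(reading_times, X_base).fit()
    full = sm.OLS(reading_times, X_full).fit()
    return full.llf - base.llf


# Usage (with synthetic data): a higher delta_loglik means the LM's
# surprisal explains more of the reading-time variance.
# rng = np.random.default_rng(0)
# rt = rng.normal(300, 50, size=1000)      # reading times in ms
# base = rng.normal(size=(1000, 2))        # e.g., word length, log frequency
# surp = surprisal(rng.uniform(-10, -1, size=1000))
# print(delta_loglik(rt, surp, base))
```

Under this setup, the sequential and hierarchical LMs differ only in how they compute the per-word log-probabilities; the regression comparison itself is identical across models.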
