Are we there yet? Encoder-decoder neural networks as cognitive models of English past tense inflection

The cognitive mechanisms needed to account for the English past tense have long been a subject of debate in linguistics and cognitive science. Neural network models were proposed early on, but were shown to have clear flaws. Recently, however, Kirov and Cotterell (2018) showed that modern encoder-decoder (ED) models overcome many of these flaws. They also presented evidence that ED models demonstrate humanlike performance in a nonce-word task. Here, we look more closely at the behaviour of their model in this task. We find that (1) the model exhibits instability across multiple simulations in terms of its correlation with human data, and (2) even when results are aggregated across simulations (treating each simulation as an individual human participant), the fit to the human data is not strong—worse than an older rule-based model. These findings hold up through several alternative training regimes and evaluation measures. Although other neural architectures might do better, we conclude that there is still insufficient evidence to claim that neural nets are a good cognitive model for this task.
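The two evaluation views described above (per-simulation correlation with human data, and correlation after aggregating simulations as if each were one participant) can be illustrated with a minimal sketch. The use of Spearman rank correlation, the function names, and the data layout (one list of per-item scores per simulation) are all assumptions made for illustration; they are not taken from the paper.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation; returns NaN if either input has zero variance."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation, computed as Pearson on the ranks."""
    return pearson(ranks(xs), ranks(ys))


def per_run_correlations(run_scores, human_scores):
    """One correlation per simulation; the spread shows (in)stability."""
    return [spearman(scores, human_scores) for scores in run_scores]


def aggregate_correlation(run_outputs, human_rates):
    """Treat each simulation as one participant.

    run_outputs[r][i] is 1 if simulation r produced the target form for
    nonce item i, else 0 (hypothetical encoding). The per-item mean over
    simulations is then compared with the human production rates.
    """
    n_items = len(human_rates)
    model_rates = [mean(run[i] for run in run_outputs) for i in range(n_items)]
    return spearman(model_rates, human_rates)
```

Under these assumptions, a wide range in the output of `per_run_correlations` would correspond to finding (1), and a modest value from `aggregate_correlation` to finding (2).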

[1]  David, et al.  On Learning the Past Tenses of English Verbs , 2021 .

[2]  Tal Linzen  What can linguistics and deep learning contribute to each other? Response to Pater , 2019, Language.

[3]  Tal Linzen,et al.  What can linguistics and deep learning contribute to each other? Response to Pater , 2018, Language.

[4]  Tal Linzen,et al.  Distinct patterns of syntactic agreement errors in recurrent networks and humans , 2018, CogSci.

[5]  Ryan Cotterell,et al.  Recurrent Neural Networks in Linguistic Theory: Revisiting Pinker and Prince (1988) and the Past Tense Debate , 2018, TACL.

[6]  Sharon Goldwater,et al.  Context Sensitive Neural Lemmatization with Lematus , 2018, NAACL-HLT.

[7]  B. Ambridge,et al.  Children's Acquisition of the English Past‐Tense: Evidence for a Single‐Route Account From Novel Verb Production Data , 2018, Cognitive science.

[8]  Katharina Kann,et al.  MED: The LMU System for the SIGMORPHON 2016 Shared Task on Morphological Reinflection , 2016, SIGMORPHON.

[9]  Timothy O'Donnell,et al.  Productivity and Reuse in Language: A Theory of Linguistic Computation and Storage , 2015 .

[10]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[11]  Quoc V. Le,et al.  Sequence to Sequence Learning with Neural Networks , 2014, NIPS.

[12]  David C. Plaut,et al.  Quasiregularity and Its Discontents: The Legacy of the Past Tense Debate , 2014, Cogn. Sci.

[13]  Matthew D. Zeiler  ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.

[14]  B. Ambridge,et al.  Children's judgments of regular and irregular novel past-tense forms: new data on the English past-tense debate. , 2010, Developmental psychology.

[15]  Thomas L. Griffiths,et al.  Interpolating between types and tokens by estimating power-law generators , 2005, NIPS.

[16]  Mirjam Ernestus,et al.  Analogical effects in regular past tense production in Dutch , 2004 .

[17]  B. Hayes,et al.  Rules vs. analogy in English past tenses: a computational/experimental study , 2003, Cognition.

[18]  S. Pinker,et al.  The past and future of the past tense , 2002, Trends in Cognitive Sciences.

[19]  J. Pierrehumbert Stochastic phonology , 2001 .

[20]  Patrick Juola,et al.  A connectionist model of English past tense and plural morphology , 1999, Cogn. Sci.

[21]  G. Marcus Can connectionism save constructivism? , 1998, Cognition.

[22]  Sandra A. Thompson,et al.  Three Frequency Effects in Syntax , 1997 .

[23]  J. Elman,et al.  Rethinking Innateness: A Connectionist Perspective on Development , 1996 .

[24]  G. Marcus The acquisition of the English past tense in children and multilayered connectionist networks , 1995, Cognition.

[25]  Steven Pinker,et al.  Generalisation of regular and irregular morphological patterns , 1993 .

[26]  G. Marcus,et al.  Regular and irregular inflection in the acquisition of German noun plurals , 1992, Cognition.

[27]  Barbara Hannan,et al.  Connectionism and the Mind: An Introduction to Parallel Processing in Networks , 1992 .

[28]  M. McCloskey Networks and Theories: The Place of Connectionism in Cognitive Science , 1991 .

[29]  S. Pinker,et al.  On language and connectionism: Analysis of a parallel distributed processing model of language acquisition , 1988, Cognition.