Are we there yet? Encoder-decoder neural networks as cognitive models of English past tense inflection

The cognitive mechanisms needed to account for the English past tense have long been a subject of debate in linguistics and cognitive science. Neural network models were proposed early on, but were shown to have clear flaws. Recently, however, Kirov and Cotterell (2018) showed that modern encoder-decoder (ED) models overcome many of these flaws. They also presented evidence that ED models demonstrate humanlike performance in a nonce-word task. Here, we look more closely at the behaviour of their model in this task. We find that (1) the model's correlation with human data is unstable across multiple simulations, and (2) even when results are aggregated across simulations (treating each simulation as an individual human participant), the fit to the human data is not strong: it is worse than that of an older rule-based model. These findings hold up through several alternative training regimes and evaluation measures. Although other neural architectures might do better, we conclude that there is still insufficient evidence to claim that neural nets are a good cognitive model for this task.

[1] James L. McClelland, et al. On learning the past-tenses of English verbs: implicit rules or parallel distributed processing, 1986.

[2] S. Pinker, et al. On language and connectionism: Analysis of a parallel distributed processing model of language acquisition, 1988, Cognition.

[3] M. McCloskey. Networks and Theories: The Place of Connectionism in Cognitive Science, 1991.

[4] G. Marcus, et al. Regular and irregular inflection in the acquisition of German noun plurals, 1992, Cognition.

[5] Barbara Hannan, et al. Connectionism and the Mind: An Introduction to Parallel Processing in Networks, 1992.

[6] Steven Pinker, et al. Generalisation of regular and irregular morphological patterns, 1993.

[7] G. Marcus. The acquisition of the English past tense in children and multilayered connectionist networks, 1995, Cognition.

[8] J. Elman, et al. Rethinking Innateness: A Connectionist Perspective on Development, 1996.

[9] Sandra A. Thompson, et al. Three Frequency Effects in Syntax, 1997.

[10] G. Marcus. Can connectionism save constructivism?, 1998, Cognition.

[11] Patrick Juola, et al. A connectionist model of English past tense and plural morphology, 1999, Cogn. Sci.

[12] J. Pierrehumbert. Stochastic phonology, 2001.

[13] S. Pinker, et al. The past and future of the past tense, 2002, Trends in Cognitive Sciences.

[14] B. Hayes, et al. Rules vs. analogy in English past tenses: a computational/experimental study, 2003, Cognition.

[15] Mirjam Ernestus, et al. Analogical effects in regular past tense production in Dutch, 2004.

[16] Thomas L. Griffiths, et al. Interpolating between types and tokens by estimating power-law generators, 2005, NIPS.

[17] B. Ambridge, et al. Children's judgments of regular and irregular novel past-tense forms: new data on the English past-tense debate, 2010, Developmental Psychology.

[18] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method, 2012, ArXiv.

[19] David C. Plaut, et al. Quasiregularity and Its Discontents: The Legacy of the Past Tense Debate, 2014, Cogn. Sci.

[20] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.

[21] Timothy O'Donnell, et al. Productivity and Reuse in Language: A Theory of Linguistic Computation and Storage, 2015.

[22] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.

[23] Katharina Kann, et al. MED: The LMU System for the SIGMORPHON 2016 Shared Task on Morphological Reinflection, 2016, SIGMORPHON.

[24] Alexander M. Rush, et al. OpenNMT: Open-Source Toolkit for Neural Machine Translation, 2017, ACL.

[25] Tal Linzen, et al. Distinct patterns of syntactic agreement errors in recurrent networks and humans, 2018, CogSci.

[26] Ryan Cotterell, et al. Recurrent Neural Networks in Linguistic Theory: Revisiting Pinker and Prince (1988) and the Past Tense Debate, 2018, TACL.

[27] Sharon Goldwater, et al. Context Sensitive Neural Lemmatization with Lematus, 2018, NAACL-HLT.

[28] B. Ambridge, et al. Children's Acquisition of the English Past-Tense: Evidence for a Single-Route Account From Novel Verb Production Data, 2018, Cognitive Science.

[29] Tal Linzen, et al. What can linguistics and deep learning contribute to each other? Response to Pater, 2018, Language.

[30] Tal Linzen. What can linguistics and deep learning contribute to each other? Response to Pater, 2019, Language.