Modeling Second-Language Learning from a Psychological Perspective

Psychological research on learning and memory has tended to emphasize small-scale laboratory studies. However, large datasets of people using educational software provide opportunities to explore these issues from a new perspective. In this paper, we describe our approach to the Duolingo Second Language Acquisition Modeling (SLAM) competition, which was run in early 2018. We used a well-known class of algorithms (gradient boosted decision trees), with features partially informed by theories from the psychological literature. After detailing our modeling approach and a number of supplementary simulations, we reflect on the degree to which psychological theory aided the model and on the potential for cognitive science and predictive modeling competitions to gain from each other.
