Teaching Multiple Concepts to a Forgetful Learner

How can we help a forgetful learner learn multiple concepts within a limited time frame? Although optimal schedules for teaching a single concept under a given memory model have been studied extensively, existing approaches for teaching multiple concepts typically rely on heuristic scheduling techniques without theoretical guarantees. In this paper, we approach the problem from the perspective of discrete optimization and introduce a novel algorithmic framework for teaching multiple concepts with strong performance guarantees. Our framework is both generic, allowing the design of teaching schedules for different memory models, and interactive, allowing the teacher to adapt the schedule to the underlying forgetting mechanisms of the learner. Furthermore, for a well-known memory model, we identify a regime of model parameters in which our framework is guaranteed to achieve high performance. We perform extensive evaluations using simulations along with real user studies in two concrete applications: (i) an educational app for online vocabulary teaching; and (ii) an app for teaching novices how to recognize animal species from images. Our results demonstrate the effectiveness of our algorithm compared to popular heuristic approaches.
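
To make the discrete-optimization view concrete, the following is a minimal illustrative sketch, not the algorithm from the paper. It assumes an exponential forgetting curve whose half-life grows with each repetition, and a simple greedy rule that, at every step, teaches the concept with the largest gain in summed future recall. The function names (`recall`, `future_recall`, `greedy_schedule`) and the parameters `h0` and `growth` are hypothetical and chosen only for illustration.

```python
from typing import List


def recall(t: int, last: int, count: int,
           h0: float = 1.0, growth: float = 2.0) -> float:
    """Recall probability at time t (assumed exponential forgetting model).

    last  : time step at which the concept was last taught
    count : number of times the concept has been taught so far
    The half-life h0 * growth**(count - 1) is an illustrative assumption.
    """
    if count == 0:
        return 0.0  # a concept that was never taught cannot be recalled
    half_life = h0 * growth ** (count - 1)
    return 2.0 ** (-(t - last) / half_life)


def future_recall(t: int, last: int, count: int, horizon: int) -> float:
    """Summed recall over the time steps t+1 .. horizon+1."""
    return sum(recall(s, last, count) for s in range(t + 1, horizon + 2))


def greedy_schedule(n_concepts: int, horizon: int) -> List[int]:
    """Pick, at each step, the concept with the largest marginal gain."""
    last = [0] * n_concepts   # last teaching time per concept
    count = [0] * n_concepts  # number of repetitions per concept
    schedule: List[int] = []
    for t in range(1, horizon + 1):
        def gain(i: int) -> float:
            # Gain in future recall if concept i is taught now rather than skipped.
            taught = future_recall(t, t, count[i] + 1, horizon)
            skipped = future_recall(t, last[i], count[i], horizon)
            return taught - skipped

        best = max(range(n_concepts), key=gain)  # ties go to the lowest index
        schedule.append(best)
        last[best] = t
        count[best] += 1
    return schedule


if __name__ == "__main__":
    # Schedule 3 concepts over 10 teaching steps and print the chosen order.
    print(greedy_schedule(n_concepts=3, horizon=10))
```

Because the rule is adaptive in form, the assumed `recall` function could be swapped for a different memory model, or its parameters updated from observed learner responses, without changing `greedy_schedule`.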
