Enhancing metacognitive reinforcement learning using reward structures and feedback

How do we learn to think better, and what can we do to promote such metacognitive learning? Here, we propose that cognitive growth proceeds through metacognitive reinforcement learning. We apply this theory to model how people learn how far to plan ahead and test its predictions about the speed of metacognitive learning in two experiments. In the first experiment, we find that our model can discern a reward structure that promotes metacognitive reinforcement learning from one that hinders it. In the second experiment, we show that our model can be used to design a feedback mechanism that enhances metacognitive reinforcement learning in an environment that hinders learning. Our results suggest that modeling metacognitive learning is a promising step towards promoting cognitive growth.
