Adaptive planning in human search

How do people plan ahead when searching for rewards? We investigate planning in a foraging task in which participants search for rewards on an infinite two-dimensional grid. Their search is best described by a model that plans at least three steps ahead. Furthermore, participants do not seem to update their beliefs during planning; instead, they treat their initial beliefs as given, a strategy similar to a heuristic called root sampling. This planning algorithm also matches participants’ behavior in test problems with restricted movement and varying degrees of information, where it outperforms more complex models. These results enrich our understanding of adaptive planning in complex environments.
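To make the idea of depth-limited planning with fixed initial beliefs concrete, the following is a minimal Python sketch and not the model fitted in this work. It assumes a hypothetical `belief(cell)` function that returns a mean and standard deviation for the reward at each grid cell, independent Gaussian beliefs across cells, and that each cell pays out only on the first visit. For every simulation, one world is sampled from the initial beliefs and then searched three steps deep without any belief update along the way.

```python
import random

# The four grid moves; names and coordinates are illustrative assumptions.
MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}


def sample_world(belief, rng):
    """Draw one reward landscape from the initial beliefs (lazily, cell by cell).

    `belief(cell)` is a hypothetical function returning (mean, sd) for a cell.
    Lazy sampling lets the grid stay unbounded.
    """
    cache = {}

    def rewards(cell):
        if cell not in cache:
            mean, sd = belief(cell)
            cache[cell] = rng.gauss(mean, sd)
        return cache[cell]

    return rewards


def best_return(pos, depth, rewards, visited):
    """Best total reward reachable from `pos` in `depth` more steps.

    The sampled world is held fixed: beliefs are never updated here.
    Assumption: a cell pays out only the first time it is entered.
    """
    if depth == 0:
        return 0.0
    best = float("-inf")
    for dx, dy in MOVES.values():
        nxt = (pos[0] + dx, pos[1] + dy)
        gain = 0.0 if nxt in visited else rewards(nxt)
        future = best_return(nxt, depth - 1, rewards, visited | {nxt})
        best = max(best, gain + future)
    return best


def plan(pos, belief, depth=3, n_samples=200, seed=0):
    """Pick the first move with the highest average depth-limited return."""
    rng = random.Random(seed)
    totals = {name: 0.0 for name in MOVES}
    for _ in range(n_samples):
        rewards = sample_world(belief, rng)  # sampled once, at the root
        for name, (dx, dy) in MOVES.items():
            nxt = (pos[0] + dx, pos[1] + dy)
            visited = frozenset({pos, nxt})
            totals[name] += rewards(nxt) + best_return(
                nxt, depth - 1, rewards, visited
            )
    return max(totals, key=totals.get)


if __name__ == "__main__":
    # Toy belief: expected reward grows with the x coordinate.
    print(plan((0, 0), lambda cell: (float(cell[0]), 0.5)))
```

A planner that updated its beliefs during search would instead revise the reward estimates of neighbouring cells after each imagined observation, which is considerably more expensive; the sketch above skips that step entirely.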
