Strategic exploration in human adaptive control
Maarten Speekenbrink | Eric Schulz | Edgar D. Klenske | Neil R. Bramley
[1] John Langford, et al. Efficient Exploration in Reinforcement Learning, 2017, Encyclopedia of Machine Learning and Data Mining.
[2] Andrew G. Barto, et al. Optimal learning: Computational procedures for Bayes-adaptive Markov decision processes, 2002.
[3] Tamer Basar, et al. Dual Control Theory, 2001.
[4] M. Speekenbrink, et al. Putting bandits into context: How function learning supports decision making, 2016, bioRxiv.
[5] Jonathan D. Nelson, et al. Mapping the unknown: The spatially correlated multi-armed bandit, 2017, bioRxiv.
[6] Philipp Hennig, et al. Dual Control for Approximate Bayesian Reinforcement Learning, 2015, J. Mach. Learn. Res..
[7] R. Rescorla, et al. A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement, 1972.
[8] Pascal Poupart, et al. Bayesian Reinforcement Learning, 2010, Encyclopedia of Machine Learning.
[9] Andreas Krause, et al. Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting, 2009, IEEE Transactions on Information Theory.
[10] Jonathan D. Cohen, et al. Humans use directed and random exploration to solve the explore-exploit dilemma, 2014, Journal of Experimental Psychology: General.
[11] Yaakov Bar-Shalom, et al. An actively adaptive control for linear systems with random parameters via the dual control approach, 1972, CDC 1972.