Cheap but Clever: Human Active Learning in a Bandit Setting
[1] John K Kruschke, et al. Bayesian data analysis, 2010, Wiley interdisciplinary reviews. Cognitive science.
[2] Jonathan D. Cohen, et al. Sequential effects: Superstition or rational behavior?, 2008, NIPS.
[3] M. Lee, et al. A Bayesian analysis of human decision-making on bandit problems, 2009.
[4] J. Gittins. Bandit processes and dynamic allocation indices, 1979.
[5] Michael D. Lee, et al. Psychological models of human and optimal performance in bandit problems, 2011, Cognitive Systems Research.
[6] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[7] Mark A. Olson, et al. An experimental analysis of the bandit problem, 1997.
[8] H. Robbins. Some aspects of the sequential design of experiments, 1952.
[9] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[10] Warren B. Powell, et al. Optimal Learning, 2022, Encyclopedia of Machine Learning and Data Mining.
[11] Warren B. Powell, et al. The Knowledge Gradient Algorithm for a General Class of Online Learning Problems, 2012, Oper. Res.
[12] Warren B. Powell, et al. A Knowledge-Gradient Policy for Sequential Information Collection, 2008, SIAM J. Control. Optim.
[13] P. Dayan, et al. Cortical substrates for exploratory decisions in humans, 2006, Nature.
[14] Cognitive Models and the Wisdom of Crowds: A Case Study Using the Bandit Problem, 2010.