Uncertainty and Exploration in a Restless Bandit Problem
[1] Paolo Viappiani,et al. Thompson Sampling for Bayesian Bandits with Resets , 2013, ADT.
[2] John N. Tsitsiklis,et al. The Complexity of Optimal Queuing Network Control , 1999, Math. Oper. Res..
[3] Angela J. Yu,et al. Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration , 2007, Philosophical Transactions of the Royal Society B: Biological Sciences.
[4] Paul R. Schrater,et al. Bayesian modeling of human sequential decision-making on the multi-armed bandit problem , 2008 .
[5] Chang‐Jin Kim,et al. Dynamic linear models with Markov-switching , 1994 .
[6] J. Busemeyer,et al. A contribution of cognitive decision models to clinical assessment: decomposing performance on the Bechara gambling task. , 2002, Psychological assessment.
[7] J. Gittins. Bandit processes and dynamic allocation indices , 1979 .
[8] G. Bower,et al. From conditioning to category learning: an adaptive network model. , 1988 .
[9] Michael D. Lee,et al. Modeling Human Performance in Restless Bandits with Particle Filters , 2009, J. Probl. Solving.
[10] A. Roth,et al. Predicting How People Play Games: Reinforcement Learning in Experimental Games with Unique, Mixed Strategy Equilibria , 1998 .
[11] Ashok K. Agrawala,et al. Thompson Sampling for Dynamic Multi-armed Bandits , 2011, 2011 10th International Conference on Machine Learning and Applications and Workshops.
[12] P. Whittle. Restless bandits: activity allocation in a changing world , 1988, Journal of Applied Probability.
[13] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[14] P. W. Jones,et al. Bandit Problems, Sequential Allocation of Experiments , 1987 .
[15] R Core Team. R: A language and environment for statistical computing , 2014 .
[16] T. Başar,et al. A New Approach to Linear Filtering and Prediction Problems , 2001 .
[17] Eldad Yechiam,et al. Comparison of basic assumptions embedded in learning models for experience-based decision making , 2005, Psychonomic bulletin & review.
[18] R. E. Kalman,et al. New Results in Linear Filtering and Prediction Theory , 1961 .
[19] W. R. Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples , 1933 .
[20] P. Stone,et al. The Nature of Belief-Directed Exploratory Choice in Human Decision-Making , 2011, Front. Psychology.
[21] A. Tversky,et al. Advances in prospect theory: Cumulative representation of uncertainty , 1992 .
[22] Timothy E. J. Behrens,et al. Learning the value of information in an uncertain world , 2007, Nature Neuroscience.
[23] P. Dayan,et al. Cortical substrates for exploratory decisions in humans , 2006, Nature.
[24] Iain D. Gilchrist,et al. Testing a Simplified Method for Measuring Velocity Integration in Saccades Using a Manipulation of Target Contrast , 2011, Front. Psychology.
[25] M. Lee,et al. A Bayesian analysis of human decision-making on bandit problems , 2009 .
[26] Angela J. Yu,et al. Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration , 2008 .
[27] Jerome R. Busemeyer,et al. Comparison of Decision Learning Models Using the Generalization Criterion Method , 2008, Cogn. Sci..
[28] R. Duncan Luce,et al. Individual Choice Behavior , 1959 .
[29] Ole-Christoffer Granmo,et al. Solving Non-Stationary Bandit Problems by Random Sampling from Sibling Kalman Filters , 2010, IEA/AIE.
[30] E. Wagenmakers,et al. AIC model selection using Akaike weights , 2004, Psychonomic bulletin & review.