Multi-Armed Bandits for Human-Machine Decision Making
[1] Jonathan D. Cohen,et al. Humans use directed and random exploration to solve the explore-exploit dilemma , 2014, Journal of Experimental Psychology: General.
[2] Yi Gai,et al. Learning Multiuser Channel Allocations in Cognitive Radio Networks: A Combinatorial Multi-Armed Bandit Formulation , 2010, 2010 IEEE Symposium on New Frontiers in Dynamic Spectrum (DySPAN).
[3] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985 .
[4] H. Robbins. Some aspects of the sequential design of experiments , 1952 .
[5] Vaibhav Srivastava,et al. Correlated Multiarmed Bandit Problem: Bayesian Algorithms and Regret Analysis , 2015, ArXiv.
[6] Paul B. Reverdy. Modeling Human Decision-making in Multi-armed Bandits , 2013 .
[7] Steven Kay,et al. Fundamentals of Statistical Signal Processing , 2001 .
[8] Vaibhav Srivastava,et al. Satisficing in Multi-Armed Bandit Problems , 2015, IEEE Transactions on Automatic Control.
[9] Aurélien Garivier,et al. On Bayesian Upper Confidence Bounds for Bandit Problems , 2012, AISTATS.
[10] Sébastien Bubeck,et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems , 2012, Found. Trends Mach. Learn..
[11] Paul Schrater. Structure learning in human sequential decision-making , 2009 .
[12] M. Lee,et al. A Bayesian analysis of human decision-making on bandit problems , 2009 .
[13] Naomi Ehrich Leonard,et al. Parameter Estimation in Softmax Decision-Making Models With Linear Objective Functions , 2015, IEEE Transactions on Automation Science and Engineering.