Regression Oracles and Exploration Strategies for Short-Horizon Multi-Armed Bandits
[1] Daniele Loiacono,et al. Player Modeling , 2013, Artificial and Computational Intelligence in Games.
[2] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 2022 .
[3] Leslie Pack Kaelbling,et al. Algorithms for multi-armed bandit problems , 2014, ArXiv.
[4] Omar Besbes,et al. Optimal Exploration-Exploitation in a Multi-Armed-Bandit Problem with Non-Stationary Rewards , 2014, Stochastic Systems.
[5] Georgios N. Yannakakis,et al. Player modeling using self-organization in Tomb Raider: Underworld , 2009, 2009 IEEE Symposium on Computational Intelligence and Games.
[6] P. Whittle. Restless Bandits: Activity Allocation in a Changing World , 1988 .
[7] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples , 1933 .
[8] Anna Trakoli,et al. Model of Human Occupation: Theory and Application , 2010 .
[9] Jichen Zhu,et al. Exploring Player Trace Segmentation for Dynamic Play Style Prediction , 2021, AIIDE.
[10] Jichen Zhu,et al. Towards Extending Social Exergame Engagement with Agents , 2018, CSCW Companion.
[11] Santiago Ontañón,et al. Player Modeling via Multi-Armed Bandits , 2020, FDG.
[12] Greta C Bernatz,et al. How humans walk: bout duration, steps per bout, and rest duration , 2008, Journal of rehabilitation research and development.
[13] Predrag Klasnja,et al. Rapidly Personalizing Mobile Health Treatment Policies with Limited Data , 2020, ArXiv.
[14] Simon M. Lucas,et al. A Survey of Monte Carlo Tree Search Methods , 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[15] H. Robbins. Some aspects of the sequential design of experiments , 1952 .
[16] J. Langford,et al. The Epoch-Greedy algorithm for contextual multi-armed bandits , 2007, NIPS 2007.
[17] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[18] Gary Kielhofner,et al. Model of Human Occupation: Theory and Application , 1985 .
[19] J. -L. Guo,et al. Weblog patterns and human dynamics with decreasing interest , 2010, 1008.0042.