[1] Ronald A. Howard,et al. Information Value Theory, 1966, IEEE Trans. Syst. Sci. Cybern.
[2] H. Robbins,et al. Maximally dependent random variables, 1976, Proceedings of the National Academy of Sciences of the United States of America.
[3] Stuart J. Russell,et al. Do the right thing - studies in limited rationality, 1991.
[4] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[5] Russell Greiner,et al. The Budgeted Multi-armed Bandit Problem , 2004, COLT.
[6] P. Maes. How to Do the Right Thing, 1989.
[7] Rémi Coulom,et al. Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search , 2006, Computers and Games.
[8] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[9] Warren B. Powell,et al. The Knowledge-Gradient Policy for Correlated Normal Beliefs, 2009, INFORMS J. Comput.
[10] V. T. Rajan,et al. Bayesian Inference in Monte-Carlo Tree Search , 2010, UAI.
[11] Andrew M. Ross. Computing Bounds on the Expected Maximum of Correlated Normal Variables, 2010.
[12] Stuart J. Russell,et al. Metareasoning for Monte Carlo Tree Search, 2011.
[13] Amir Dezfouli,et al. Speed/Accuracy Trade-Off between the Habitual and the Goal-Directed Processes, 2011, PLoS Comput. Biol.
[14] David Tolpin,et al. MCTS Based on Simple Regret , 2012, AAAI.
[15] Warren B. Powell,et al. The Knowledge Gradient Algorithm for a General Class of Online Learning Problems, 2012, Oper. Res.
[16] David Barber,et al. Bayesian reasoning and machine learning, 2012.
[17] David Tolpin,et al. Selecting Computations: Theory and Applications , 2012, UAI.
[18] Feng Wu,et al. Bayesian Mixture Modelling and Inference based Thompson Sampling in Monte-Carlo Tree Search , 2013, NIPS.
[19] Peter Dayan,et al. Scalable and Efficient Bayes-Adaptive Reinforcement Learning Based on Monte-Carlo Tree Search, 2013, J. Artif. Intell. Res.
[20] Thomas L. Griffiths,et al. Algorithm selection by rational metareasoning as a model of human strategy selection , 2014, NIPS.
[21] Eric Horvitz,et al. Metareasoning for Planning Under Uncertainty , 2015, IJCAI.
[22] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[23] Matthias Poloczek,et al. Bayesian Optimization with Gradients , 2017, NIPS.
[24] Marcelo G Mattar,et al. Prioritized memory access explains planning and hippocampal replay , 2017, Nature Neuroscience.
[25] Mehdi Keramati,et al. Optimizing the depth and the direction of prospective planning using information values, 2019, PLoS Comput. Biol.