Bayesian Hierarchical Reinforcement Learning
Feng Cao | Soumya Ray
[1] Doina Precup, et al. Learning Options in Reinforcement Learning, 2002, SARA.
[2] David Andre, et al. State abstraction for programmable reinforcement learning agents, 2002, AAAI/IAAI.
[3] Stuart J. Russell, et al. Bayesian Q-Learning, 1998, AAAI/IAAI.
[4] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.
[5] Mohammad Ghavamzadeh, et al. Bayesian actor-critic algorithms, 2007, ICML '07.
[6] Mohammad Ghavamzadeh, et al. Bayesian Policy Gradient Algorithms, 2006, NIPS.
[7] Malcolm J. A. Strens, et al. A Bayesian Framework for Reinforcement Learning, 2000, ICML.
[8] Peter Stone, et al. Hierarchical model-based reinforcement learning: R-max + MAXQ, 2008, ICML '08.
[9] Alan Fern, et al. Multi-task reinforcement learning: a hierarchical Bayesian approach, 2007, ICML '07.
[10] Sridhar Mahadevan, et al. Recent Advances in Hierarchical Reinforcement Learning, 2003, Discret. Event Dyn. Syst.
[11] David Andre, et al. Model based Bayesian Exploration, 1999, UAI.
[12] Xin Chen, et al. Model-based learning with Bayesian and MAXQ value function decomposition for hierarchical task, 2010, 2010 8th World Congress on Intelligent Control and Automation.
[13] Alessandro Lazaric, et al. Bayesian Multi-Task Reinforcement Learning, 2010, ICML.
[14] Sridhar Mahadevan, et al. Recent Advances in Hierarchical Reinforcement Learning, 2003.
[15] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[16] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[17] Shie Mannor, et al. Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning, 2003, ICML.
[18] Thomas G. Dietterich, et al. Automatic discovery and transfer of MAXQ hierarchies, 2008, ICML '08.
[19] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[20] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, 1933.
[21] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[22] Ronald E. Parr, et al. Hierarchical control and learning for Markov decision processes, 1998.
[23] Stuart J. Russell, et al. A compact, hierarchically optimal Q-function decomposition, 2006, UAI 2006.