A review of optimistic planning in Markov decision processes
[1] John N. Tsitsiklis, et al. Feature-based methods for large scale dynamic programming, 2004, Machine Learning.
[2] Rémi Munos, et al. Optimistic Planning of Deterministic Systems, 2008, EWRL.
[3] Csaba Szepesvári, et al. Online Optimization in X-Armed Bandits, 2008, NIPS.
[4] Rémi Munos, et al. Open Loop Optimistic Planning, 2010, COLT.
[5] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[6] Rémi Munos, et al. Bandit Algorithms for Tree Search, 2007, UAI.
[7] Louis Wehenkel, et al. Planning under uncertainty, ensembles of disturbance trees and kernelized discrete action spaces, 2009, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning.
[8] Nils J. Nilsson, et al. Principles of Artificial Intelligence, 1980, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[9] Steven M. LaValle, et al. Planning algorithms, 2006.
[10] Yishay Mansour, et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes, 1999, Machine Learning.
[11] Michael L. Littman, et al. Sample-Based Planning for Continuous Action Markov Decision Processes, 2011, ICAPS.
[12] Jan M. Maciejowski, et al. Predictive control: with constraints, 2002.
[13] Bart De Schutter, et al. Optimistic planning for sparsely stochastic systems, 2011, 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[14] Frédérick Garcia, et al. On-Line Search for Solving Markov Decision Processes via Heuristic Sampling, 2004, ECAI.
[15] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[16] Olivier Teytaud, et al. Modification of UCT with Patterns in Monte-Carlo Go, 2006.
[17] Bart De Schutter, et al. Reinforcement Learning and Dynamic Programming Using Function Approximators, 2010.
[18] Lucian Busoniu, et al. Optimistic planning for Markov decision processes, 2012, AISTATS.
[19] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[20] Bart De Schutter, et al. Approximate dynamic programming with a fuzzy parameterization, 2010, Autom.
[21] Thomas J. Walsh, et al. Integrating Sample-Based Planning and Model-Based Reinforcement Learning, 2010, AAAI.