Aggregating Optimistic Planning Trees for Solving Markov Decision Processes
[1] Thomas J. Walsh, et al. Integrating Sample-Based Planning and Model-Based Reinforcement Learning, 2010, AAAI.
[2] J. Ingersoll. Theory of Financial Decision Making, 1987.
[3] S. Murphy, et al. Optimal dynamic treatment regimes, 2003.
[4] Rémi Coulom, et al. Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, 2006, Computers and Games.
[5] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[6] Rémi Munos, et al. Optimistic Planning of Deterministic Systems, 2008, EWRL.
[7] Andrew G. Barto, et al. Reinforcement learning, 1998.
[8] Marko Bacic, et al. Model predictive control, 2003.
[9] Lucian Busoniu, et al. Optimistic planning for belief-augmented Markov Decision Processes, 2013, 2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[10] Olivier Teytaud, et al. Modification of UCT with Patterns in Monte-Carlo Go, 2006.
[11] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[12] Nils J. Nilsson, et al. A Formal Basis for the Heuristic Determination of Minimum Cost Paths, 1968, IEEE Trans. Syst. Sci. Cybern.
[13] Yishay Mansour, et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes, 1999, Machine Learning.
[14] Rémi Munos, et al. From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning, 2014, Found. Trends Mach. Learn.
[15] Lucian Busoniu, et al. Optimistic planning for Markov decision processes, 2012, AISTATS.
[16] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[17] Louis Wehenkel, et al. Lazy Planning under Uncertainty by Optimizing Decisions on an Ensemble of Incomplete Disturbance Trees, 2008, EWRL.
[18] Stefan Schaal, et al. Reinforcement Learning for Humanoid Robotics, 2003.