Agnostic System Identification for Monte Carlo Planning
[1] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[2] Lihong Li, et al. An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning, 2008, ICML '08.
[3] Alborz Geramifard, et al. Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping, 2008, UAI.
[4] Alex Simpkins, et al. System Identification: Theory for the User, 2nd Edition (Ljung, L.; 1999) [On the Shelf], 2012, IEEE Robotics & Automation Magazine.
[5] Erik Talvitie, et al. Model Regularization for Stable Sample Rollouts, 2014, UAI.
[6] Gerald Tesauro, et al. On-line Policy Improvement using Monte-Carlo Search, 1996, NIPS.
[7] Alborz Geramifard, et al. Reinforcement learning with misspecified model classes, 2013, 2013 IEEE International Conference on Robotics and Automation.
[8] Richard L. Lewis, et al. Reward Design via Online Gradient Ascent, 2010, NIPS.
[9] Ralf Schoknecht, et al. Optimality of Reinforcement Learning Algorithms with Linear Function Approximation, 2002, NIPS.
[10] Joel Veness, et al. Context Tree Switching, 2011, 2012 Data Compression Conference.
[11] Joel Veness, et al. A Monte-Carlo AIXI Approximation, 2009, J. Artif. Intell. Res.
[12] Rémi Munos, et al. Performance Bounds in Lp-norm for Approximate Value Iteration, 2007, SIAM J. Control. Optim.
[13] Lennart Ljung, et al. System Identification: Theory for the User, 1987.
[14] Simon M. Lucas, et al. A Survey of Monte Carlo Tree Search Methods, 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[15] J. Andrew Bagnell, et al. Agnostic System Identification for Model-Based Reinforcement Learning, 2012, ICML.