Model-Based Active Learning in Hierarchical Policies
[1] Christos Dimitrakakis, et al. TORCS, The Open Racing Car Simulator, 2005.
[2] Harold J. Kushner, et al. A new method of locating the maximum point of an arbitrary multipeak curve in the presence of noise, 1963.
[3] Lihong Li, et al. Incremental Model-based Learners With Formal Learning-Time Guarantees, 2006, UAI.
[4] Michael I. Jordan, et al. PEGASUS: A policy search method for large MDPs and POMDPs, 2000, UAI.
[5] Prasad Tadepalli, et al. Model-based Hierarchical Average-reward Reinforcement Learning, 2002, International Conference on Machine Learning.
[6] Nando de Freitas, et al. Active Policy Learning for Robot Planning and Exploration under Uncertainty, 2007, Robotics: Science and Systems.
[7] C. D. Perttunen, et al. Lipschitzian optimization without the Lipschitz constant, 1993.
[8] Simon Streltsov, et al. A Non-myopic Utility Function for Statistical Global Optimization Algorithms, 1999, J. Glob. Optim.
[9] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.
[10] Bruno Betrò, et al. Bayesian methods in global optimization, 1991, J. Glob. Optim.
[11] Piotr J. Gmytrasiewicz, et al. Interactive dynamic influence diagrams, 2007, AAMAS '07.
[12] Donald R. Jones, et al. Efficient Global Optimization of Expensive Black-Box Functions, 1998, J. Glob. Optim.
[13] M. Ghavamzadeh, et al. Hierarchical reinforcement learning in continuous state and multi-agent environments, 2005.
[14] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.
[15] Daphne Koller, et al. Active Learning for Structure in Bayesian Networks, 2001, IJCAI.
[16] Sridhar Mahadevan, et al. Recent Advances in Hierarchical Reinforcement Learning, 2003, Discret. Event Dyn. Syst.
[17] David Andre, et al. Generalized Prioritized Sweeping, 1997, NIPS.
[18] Michael L. Littman, et al. A hierarchical approach to efficient reinforcement learning in deterministic domains, 2006, AAMAS '06.
[19] Bhaskara Marthi, et al. Concurrent Hierarchical Reinforcement Learning, 2005, IJCAI.
[20] Marco Locatelli, et al. Bayesian Algorithms for One-Dimensional Global Optimization, 1997, J. Glob. Optim.
[21] Ronald E. Parr, et al. Hierarchical control and learning for Markov decision processes, 1998.
[22] Tao Wang, et al. Automatic Gait Optimization with Gaussian Process Regression, 2007, IJCAI.
[23] S. Shankar Sastry, et al. Autonomous Helicopter Flight via Reinforcement Learning, 2003, NIPS.
[24] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[25] Peter L. Bartlett, et al. Experiments with Infinite-Horizon, Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[26] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[27] A. Zilinskas, et al. Global optimization based on a statistical model and simplicial partitioning, 2002.
[28] Andreas Krause, et al. Near-optimal sensor placements in Gaussian processes, 2005, ICML.
[29] David Andre, et al. Programmable Reinforcement Learning Agents, 2000, NIPS.
[30] Joelle Pineau, et al. A Hierarchical Approach to POMDP Planning and Execution, 2004.