Using trajectory data to improve Bayesian optimization for reinforcement learning
Alan Fern | Prasad Tadepalli | Aaron Wilson
[1] Amiel Feinstein,et al. Information and information stability of random variables and processes , 1964 .
[2] Lamberto Cesari,et al. Optimization-Theory And Applications , 1983 .
[3] C. D. Perttunen,et al. Lipschitzian optimization without the Lipschitz constant , 1993 .
[4] Jonas Mockus,et al. Application of Bayesian approach to numerical methods of global and stochastic optimization , 1994, J. Glob. Optim..
[5] Geoffrey E. Hinton,et al. Using Expectation-Maximization for Reinforcement Learning , 1997, Neural Computation.
[6] Preben Alstrøm,et al. Learning to Drive a Bicycle Using Reinforcement Learning and Shaping , 1998, ICML.
[7] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[8] Stuart J. Russell,et al. Bayesian Q-Learning , 1998, AAAI/IAAI.
[9] David Andre,et al. Model based Bayesian Exploration , 1999, UAI.
[10] Malcolm J. A. Strens,et al. A Bayesian Framework for Reinforcement Learning , 2000, ICML.
[11] Peter L. Bartlett,et al. Experiments with Infinite-Horizon, Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[12] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[13] Mikhail Belkin,et al. Using manifold structure for partially labelled classification , 2002, NIPS 2002.
[14] Carl E. Rasmussen,et al. Gaussian Processes in Reinforcement Learning , 2003, NIPS.
[15] Jeff G. Schneider,et al. Covariant policy search , 2003, IJCAI 2003.
[16] Shie Mannor,et al. Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning , 2003, ICML.
[17] Nuno Vasconcelos,et al. A Kullback-Leibler Divergence Based Kernel for SVM Classification in Multimedia Applications , 2003, NIPS.
[18] Michael O. Duff,et al. Design for an Optimal Probe , 2003, ICML.
[19] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[20] Christopher K. I. Williams,et al. Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning) , 2005 .
[21] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[22] John D. Lafferty,et al. Diffusion Kernels on Statistical Manifolds , 2005, J. Mach. Learn. Res..
[23] Shie Mannor,et al. Reinforcement learning with Gaussian processes , 2005, ICML.
[24] Mohammad Ghavamzadeh,et al. Bayesian Policy Gradient Algorithms , 2006, NIPS.
[25] Nikolaus Hansen,et al. The CMA Evolution Strategy: A Comparing Review , 2006, Towards a New Evolutionary Computation.
[26] Mohammad Ghavamzadeh,et al. Bayesian actor-critic algorithms , 2007, ICML '07.
[27] Martin J. Wainwright,et al. Estimating divergence functionals and the likelihood ratio by penalized convex risk minimization , 2007, NIPS.
[28] Tao Wang,et al. Automatic Gait Optimization with Gaussian Process Regression , 2007, IJCAI.
[29] D. Lizotte. Practical Bayesian optimization , 2008 .
[30] Jan Peters,et al. Policy Search for Motor Primitives in Robotics , 2022 .
[31] Yasemin Altun,et al. Relative Entropy Policy Search , 2010 .
[32] Nando de Freitas,et al. A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning , 2010, ArXiv.
[33] Carl E. Rasmussen,et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search , 2011, ICML.