Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization
Alonso Marco | Felix Berkenkamp | Philipp Hennig | Angela P. Schoellig | Andreas Krause | Stefan Schaal | Sebastian Trimpe