Eager and Memory-Based Non-Parametric Stochastic Search Methods for Learning Control
Herke van Hoof | Abbas Abdolmaleki | David Meger | Victor Barbaros
[1] Jan Peters,et al. Reinforcement Learning to Adjust Robot Movements to New Situations , 2010, IJCAI.
[2] Luís Paulo Reis,et al. Non-parametric contextual stochastic search , 2016, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[3] Gavin Taylor,et al. Kernelized value function approximation for reinforcement learning , 2009, ICML '09.
[4] Andrew W. Moore,et al. Locally Weighted Learning for Control , 1997, Artificial Intelligence Review.
[5] Liming Xiang,et al. Kernel-Based Reinforcement Learning , 2006, ICIC.
[6] Petros Koumoutsakos,et al. Reducing the Time Complexity of the Derandomized Evolution Strategy with Covariance Matrix Adaptation (CMA-ES) , 2003, Evolutionary Computation.
[7] Xin Xu,et al. Kernel-Based Least Squares Policy Iteration for Reinforcement Learning , 2007, IEEE Transactions on Neural Networks.
[8] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[9] T. Jung,et al. Kernelizing LSPE(λ) , 2007, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[10] Luís Paulo Reis,et al. Model-Based Relative Entropy Stochastic Search , 2016, NIPS.
[11] Leslie Pack Kaelbling,et al. Practical Reinforcement Learning in Continuous Spaces , 2000, ICML.
[12] Tom Schaul,et al. Efficient natural evolution strategies , 2009, GECCO.
[13] Oliver Kroemer,et al. A Non-Parametric Approach to Dynamic Programming , 2011, NIPS.
[14] David W. Aha,et al. Learning to Catch: Applying Nearest Neighbor Algorithms to Dynamic Control Tasks , 1994 .
[15] Christopher G. Atkeson,et al. Using locally weighted regression for robot learning , 1991, Proceedings. 1991 IEEE International Conference on Robotics and Automation.
[16] Yasemin Altun,et al. Relative Entropy Policy Search , 2010 .
[17] Marc Toussaint,et al. Path Integral Control by Reproducing Kernel Hilbert Space Embedding , 2013, IJCAI.
[18] Masashi Sugiyama,et al. Policy Search with High-Dimensional Context Variables , 2016, AAAI.
[19] Jürgen Schmidhuber,et al. State-Dependent Exploration for Policy Gradient Methods , 2008, ECML/PKDD.
[20] Carl E. Rasmussen,et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search , 2011, ICML.
[21] Jan Peters,et al. Non-parametric Policy Search with Limited Information Loss , 2017, J. Mach. Learn. Res.
[22] Jun Nakanishi,et al. Learning Attractor Landscapes for Learning Motor Primitives , 2002, NIPS.
[23] Bruno Castro da Silva,et al. Learning Parameterized Skills , 2012, ICML.
[24] Luís Paulo Reis,et al. Contextual Stochastic Search , 2016, GECCO.
[25] Jason Pazis,et al. Non-Parametric Approximate Linear Programming for MDPs , 2011, AAAI.
[26] Olivier Sigaud,et al. Learning compact parameterized skills with a single regression , 2013, 2013 13th IEEE-RAS International Conference on Humanoid Robots (Humanoids).
[27] Carl E. Rasmussen,et al. Gaussian Processes in Reinforcement Learning , 2003, NIPS.
[28] Ai Poh Loh,et al. Model-based contextual policy search for data-efficient generalization of robot skills , 2017, Artif. Intell.
[29] Jan Peters,et al. A Survey on Policy Search for Robotics , 2013, Found. Trends Robotics.
[30] Peter Englert,et al. Policy Search in Reproducing Kernel Hilbert Space , 2016, IJCAI.
[31] Olivier Sigaud,et al. Path Integral Policy Improvement with Covariance Matrix Adaptation , 2012, ICML.
[32] Guy Lever,et al. Modelling transition dynamics in MDPs with RKHS embeddings , 2012, ICML.
[33] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[34] Jing Peng,et al. Efficient Memory-Based Dynamic Programming , 1995, ICML.
[35] Prasad Tadepalli,et al. Scaling Up Average Reward Reinforcement Learning by Approximating the Domain Models and the Value Function , 1996, ICML.
[36] Shie Mannor,et al. Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning , 2003, ICML.
[37] Guy Lever,et al. Modelling Policies in MDPs in Reproducing Kernel Hilbert Space , 2015, AISTATS.
[38] Jan Peters,et al. Hierarchical Relative Entropy Policy Search , 2014, AISTATS.
[39] Sehoon Ha,et al. Evolutionary optimization for parameterized whole-body dynamic motor skills , 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).