Learning of Non-Parametric Control Policies with High-Dimensional State Features
暂无分享,去创建一个
[1] Andrew W. Moore,et al. Gradient Descent for General Reinforcement Learning , 1998, NIPS.
[2] Bernhard Schölkopf,et al. A Generalized Representer Theorem , 2001, COLT/EuroCOLT.
[3] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[4] Lihong Li,et al. An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning , 2008, ICML '08.
[5] Yasemin Altun,et al. Relative Entropy Policy Search , 2010 .
[6] Carl E. Rasmussen,et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search , 2011, ICML.
[7] Jason Pazis,et al. Non-Parametric Approximate Linear Programming for MDPs , 2011, AAAI.
[8] Jan Peters,et al. Reinforcement Learning to Adjust Robot Movements to New Situations , 2010, IJCAI.
[9] Guy Lever,et al. Conditional mean embeddings as regressors , 2012, ICML.
[10] Guy Lever,et al. Modelling transition dynamics in MDPs with RKHS embeddings , 2012, ICML.
[11] Kenji Fukumizu,et al. Hilbert Space Embeddings of POMDPs , 2012, UAI.
[12] Martin A. Riedmiller,et al. Learn to Swing Up and Balance a Real Pole Based on Raw Visual Input Data , 2012, ICONIP.
[13] K. Fukumizu,et al. Kernel Embeddings of Conditional Distributions: A Unified Kernel Framework for Nonparametric Inference in Graphical Models , 2013, IEEE Signal Processing Magazine.
[14] Byron Boots,et al. Hilbert Space Embeddings of Predictive State Representations , 2013, UAI.
[15] Marc Toussaint,et al. Path Integral Control by Reproducing Kernel Hilbert Space Embedding , 2013, IJCAI.
[16] Jan Peters,et al. Data-Efficient Generalization of Robot Skills with Contextual Policy Search , 2013, AAAI.
[17] Gergely Neu,et al. Online learning in episodic Markovian decision processes by relative entropy policy search , 2013, NIPS.
[18] Oliver Kroemer,et al. Learning sequential motor tasks , 2013, 2013 IEEE International Conference on Robotics and Automation.
[19] Jan Peters,et al. Sample-based informationl-theoretic stochastic optimal control , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[20] Peter Englert,et al. Policy Search in Reproducing Kernel Hilbert Space , 2016, IJCAI.