A Generalized Path Integral Control Approach to Reinforcement Learning
Stefan Schaal | Evangelos A. Theodorou | Jonas Buchli
[1] David Q. Mayne, et al. Differential dynamic programming, 1972, The Mathematical Gazette.
[2] Robert E. Kalaba, et al. Selected Papers On Mathematical Trends In Control Theory, 1977.
[3] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[4] B. Øksendal. Stochastic differential equations: an introduction with applications, 1987.
[5] Vijaykumar Gullapalli, et al. A stochastic reinforcement learning algorithm for learning real-valued functions, 1990, Neural Networks.
[6] W. Fleming, et al. Controlled Markov processes and viscosity solutions, 1992.
[7] Robert F. Stengel, et al. Optimal Control and Estimation, 1994.
[8] J. Yong. Relations among ODEs, PDEs, FSDEs, BSDEs, and FBSDEs, 1997, Proceedings of the 36th IEEE Conference on Decision and Control.
[9] Geoffrey E. Hinton, et al. Using Expectation-Maximization for Reinforcement Learning, 1997, Neural Computation.
[10] Christopher G. Atkeson, et al. Constructive Incremental Learning from Only Local Information, 1998, Neural Computation.
[11] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[12] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[13] Shun-ichi Amari, et al. Natural Gradient Learning for Over- and Under-Complete Bases in ICA, 1999, Neural Computation.
[14] Geoffrey E. Hinton, et al. Using EM for Reinforcement Learning, 2000.
[15] L. Sciavicco, B. Siciliano. Modelling and Control of Robot Manipulators, 2000.
[16] Peter L. Bartlett, et al. Infinite-Horizon Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[17] Jun Nakanishi, et al. Learning Attractor Landscapes for Learning Motor Primitives, 2002, NIPS.
[18] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[19] H. Kappen. Linear theory for control of nonlinear stochastic systems, 2004, Physical review letters.
[20] H. Kappen. Path integrals and symmetry breaking for optimal control theory, 2005, physics/0505066.
[21] Emanuel Todorov, et al. Stochastic Optimal Control and Estimation Methods Adapted to the Noise Characteristics of the Sensorimotor System, 2005, Neural Computation.
[22] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.
[23] Marc Toussaint, et al. Probabilistic inference for solving discrete and continuous state Markov Decision Processes, 2006, ICML.
[24] Emanuel Todorov, et al. Linearly-solvable Markov decision problems, 2006, NIPS.
[25] Stefan Schaal, et al. Reinforcement Learning for Parameterized Motor Primitives, 2006, The 2006 IEEE International Joint Conference on Neural Network Proceedings.
[26] Stefan Schaal, et al. Policy Gradient Methods for Robotics, 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[27] Mohammad Ghavamzadeh, et al. Bayesian actor-critic algorithms, 2007, ICML '07.
[28] H. Kappen. An introduction to stochastic control theory, path integrals and reinforcement learning, 2007.
[29] Emanuel Todorov, et al. General duality between optimal control and estimation, 2008, 2008 47th IEEE Conference on Decision and Control.
[30] Jürgen Schmidhuber, et al. State-Dependent Exploration for Policy Gradient Methods, 2008, ECML/PKDD.
[31] Hilbert J. Kappen, et al. Graphical Model Inference in Optimal Control of Stochastic Multi-Agent Systems, 2008, J. Artif. Intell. Res.
[32] Jan Peters, et al. Machine Learning for motor skills in robotics, 2008, Künstliche Intell.
[33] Stefan Schaal, et al. Learning to Control in Operational Space, 2008, Int. J. Robotics Res.
[34] Stefan Schaal, et al. 2008 Special Issue: Reinforcement learning of motor skills with policy gradients, 2008.
[35] Emanuel Todorov, et al. Efficient computation of optimal actions, 2009, Proceedings of the National Academy of Sciences.
[36] Marc Toussaint, et al. Learning model-free robot control by a Monte Carlo EM algorithm, 2009, Auton. Robots.
[37] Carl E. Rasmussen, et al. Gaussian process dynamic programming, 2009, Neurocomputing.
[38] Marc Toussaint, et al. Trajectory prediction: learning to map situations to robot trajectories, 2009, ICML '09.
[39] Jan Peters, et al. Policy Search for Motor Primitives, 2009, Künstliche Intell.
[40] Stefan Schaal, et al. Variable Impedance Control - A Reinforcement Learning Approach, 2010, Robotics: Science and Systems.
[41] Evangelos A. Theodorou, et al. Iterative path integral stochastic optimal control: Theory and applications to motor control, 2011.
[42] Vicenç Gómez, et al. Optimal control as a graphical model inference problem, 2009, Machine Learning.
[43] H. Kappen. A path integral approach to agent planning, 2022.