Ronald J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning, 1992, Machine Learning.
Aaron Hertzmann, et al. Robust physics-based locomotion using low-dimensional planning, 2010, ACM Trans. Graph.
Yuval Tassa, et al. Synthesis and stabilization of complex behaviors through online trajectory optimization, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
Zoran Popovic, et al. Discovery of complex behaviors through contact-invariant optimization, 2012, ACM Trans. Graph.
Nicolas Pronost, et al. Interactive Character Animation Using Simulated Physics: A State‐of‐the‐Art Review, 2012, Comput. Graph. Forum.
Sergey Levine, et al. Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics, 2014, NIPS.
Yuval Tassa, et al. Learning Continuous Control Policies by Stochastic Value Gradients, 2015, NIPS.
Shane Legg, et al. Massively Parallel Methods for Deep Reinforcement Learning, 2015, ArXiv.
Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
Pieter Abbeel, et al. Benchmarking Deep Reinforcement Learning for Continuous Control, 2016, ICML.
Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
Glen Berseth, et al. Terrain-adaptive locomotion skills using deep reinforcement learning, 2016, ACM Trans. Graph.
Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
Yuval Tassa, et al. Learning and Transfer of Modulated Locomotor Controllers, 2016, ArXiv.
Nando de Freitas, et al. Sample Efficient Actor-Critic with Experience Replay, 2016, ICLR.
Glen Berseth, et al. DeepLoco: dynamic locomotion skills using hierarchical deep reinforcement learning, 2017, ACM Trans. Graph.
Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2017, ICLR.
Sergey Levine, et al. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates, 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
Yuval Tassa, et al. Data-efficient Deep Reinforcement Learning for Dexterous Manipulation, 2017, ArXiv.