Information Theoretic MPC Using Neural Network Dynamics

We introduce an information theoretic model predictive control (MPC) algorithm capable of controlling systems whose dynamics are represented by multi-layer neural networks and which are subject to complex cost criteria. The proposed approach is validated in two challenging simulation scenarios, a cart-pole swing-up and a quadrotor navigation task, and on real hardware with a 1/5-scale vehicle in an aggressive driving task.
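The control scheme the abstract describes is a sampling-based MPC update: perturb a nominal control sequence, roll each sample out through the (learned) dynamics model, and reweight the perturbations with an information-theoretic softmax over trajectory costs. The sketch below illustrates one such update; all names (`mppi_step`, the temperature `lam`, the toy double-integrator standing in for the neural network dynamics) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def mppi_step(x0, U, dynamics, cost, K=256, lam=1.0, sigma=0.5, rng=None):
    """One sampling-based MPC update: draw K perturbed control sequences,
    roll out the dynamics model, and take a cost-weighted average of the
    perturbations (softmax weights with temperature lam)."""
    rng = np.random.default_rng(rng)
    T = U.shape[0]
    eps = rng.normal(0.0, sigma, size=(K, T))  # control perturbations
    S = np.zeros(K)                            # per-sample trajectory cost
    for k in range(K):
        x = x0
        for t in range(T):
            u = U[t] + eps[k, t]
            x = dynamics(x, u)                 # a neural net forward pass in the paper
            S[k] += cost(x, u)
    S -= S.min()                               # shift for numerical stability
    w = np.exp(-S / lam)
    w /= w.sum()                               # importance-style weights
    return U + w @ eps                         # weighted perturbation update

# Toy stand-in for a learned model: a 1-D double integrator.
def dynamics(x, u, dt=0.05):
    pos, vel = x
    return np.array([pos + dt * vel, vel + dt * u])

def cost(x, u):
    return x[0] ** 2 + 0.1 * x[1] ** 2 + 0.01 * u ** 2

# Refine a 30-step control plan from rest at position 1.
x0 = np.array([1.0, 0.0])
U = np.zeros(30)
rng = np.random.default_rng(0)
for _ in range(50):
    U = mppi_step(x0, U, dynamics, cost, rng=rng)
```

In a receding-horizon loop, only the first control of `U` would be executed before the sequence is shifted and re-optimized from the newly observed state.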
