Model Predictive Path Integral Control using Covariance Variable Importance Sampling

In this paper we develop a Model Predictive Path Integral (MPPI) control algorithm based on a generalized importance sampling scheme, with the sampling-based optimization parallelized on a Graphics Processing Unit (GPU). The generalized importance sampling scheme allows changes in both the drift and diffusion terms of the sampled stochastic diffusion processes, and this flexibility is central to the performance of the model predictive control algorithm. We compare the proposed algorithm in simulation with a model predictive control version of differential dynamic programming (DDP).
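To make the idea concrete, the following is a minimal sketch of one MPPI update step in NumPy. It is an illustrative reconstruction, not the paper's implementation: the function and parameter names (`mppi_step`, `lam`, `sigma`, `K`) are hypothetical, the rollouts are computed sequentially rather than in parallel on a GPU, and the perturbations are drawn from a fixed-covariance Gaussian rather than the paper's generalized importance sampling scheme.

```python
import numpy as np

def mppi_step(x0, U, dynamics, cost, K=256, lam=1.0, sigma=0.5, rng=None):
    """One MPPI update: sample K perturbed control sequences around the
    nominal sequence U, roll each one out through the dynamics, and
    re-weight the perturbations by their exponentiated trajectory cost."""
    rng = np.random.default_rng(rng)
    T = U.shape[0]
    eps = rng.normal(0.0, sigma, size=(K,) + U.shape)  # control perturbations
    costs = np.zeros(K)
    for k in range(K):
        x = np.array(x0, dtype=float)
        for t in range(T):
            u = U[t] + eps[k, t]
            x = dynamics(x, u)
            costs[k] += cost(x, u)
    beta = costs.min()                      # subtract min cost for stability
    w = np.exp(-(costs - beta) / lam)       # path-integral importance weights
    w /= w.sum()
    # New nominal sequence: cost-weighted average of the perturbations.
    return U + np.tensordot(w, eps, axes=1)
```

A toy usage, steering a 1-D single integrator toward the origin with a quadratic state cost, would look like `mppi_step(np.array([1.0]), np.zeros((10, 1)), lambda x, u: x + 0.1 * u, lambda x, u: float(x @ x))`; the temperature `lam` trades off averaging over many samples against committing to the lowest-cost rollouts.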
