Constrained Sampling-based Trajectory Optimization using Stochastic Approximation

We propose a sampling-based trajectory optimization methodology for constrained problems. We extend recent work on stochastic search to handle box control constraints as well as nonlinear state constraints for discrete dynamical systems. For the control constraints, our strategy is to optimize over truncated parameterized distributions on the control inputs. For the state constraints, we show how non-smooth penalty functions can be incorporated into our framework. Simulations on a cartpole and a quadcopter show that our approach outperforms previous constrained sampling-based optimization methods in terms of both solution quality and convergence speed.
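
The two ingredients named above, sampling controls from a truncated distribution and penalizing state-constraint violations with a non-smooth term, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's algorithm: the double-integrator dynamics, the cost, the rejection sampler, the exponentially weighted mean update, and all names (`sample_truncated_gaussian`, `state_penalty`, `rollout_cost`, `optimize`) are hypothetical.

```python
import numpy as np

def sample_truncated_gaussian(rng, mean, std, lo, hi, n):
    """Draw n control sequences from N(mean, std^2) truncated to [lo, hi].

    Uses simple rejection: out-of-bounds entries are redrawn until all lie in the box.
    """
    out = rng.normal(mean, std, size=(n,) + mean.shape)
    bad = (out < lo) | (out > hi)
    while bad.any():
        redraw = rng.normal(mean, std, size=out.shape)
        out = np.where(bad, redraw, out)
        bad = (out < lo) | (out > hi)
    return out

def state_penalty(pos, x_max):
    """Non-smooth penalty for the state constraint |pos| <= x_max."""
    return max(0.0, abs(pos) - x_max)

def rollout_cost(u, x0, dt, x_max, penalty_weight):
    """Cost of a control sequence on an assumed double integrator (toy model)."""
    pos, vel = x0
    cost = 0.0
    for u_t in u:
        vel = vel + dt * u_t
        pos = pos + dt * vel
        cost += dt * ((pos - 0.4) ** 2 + 1e-2 * u_t ** 2)  # track pos = 0.4
        cost += penalty_weight * state_penalty(pos, x_max)
    return cost

def optimize(seed=0, horizon=30, n_samples=64, iters=40, u_lim=1.0):
    """Generic sampling-based update of the mean control sequence (illustrative only)."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(horizon)
    std = 0.5
    history = []
    for _ in range(iters):
        U = sample_truncated_gaussian(rng, mean, std, -u_lim, u_lim, n_samples)
        costs = np.array([rollout_cost(u, (0.0, 0.0), 0.1, 0.5, 10.0) for u in U])
        # Exponentially weight low-cost samples; scale by the cost spread for stability.
        w = np.exp(-(costs - costs.min()) / (costs.std() + 1e-8))
        w /= w.sum()
        # A convex combination of in-box samples stays in the box; clip guards round-off.
        mean = np.clip(w @ U, -u_lim, u_lim)
        history.append(costs.mean())
    return mean, history

if __name__ == "__main__":
    mean, history = optimize()
    print("first and last average cost:", history[0], history[-1])
```

Because every sample already satisfies the box constraint, no control clipping is needed inside the rollout. The state constraint is handled only through the penalty term, so violations are discouraged rather than ruled out.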
