Receding Horizon Differential Dynamic Programming Under Parametric Uncertainty

Generalized Polynomial Chaos (gPC) theory is widely used to represent parametric uncertainty in a system because it can propagate the evolution of that uncertainty through the dynamics. In an optimal control context, gPC can be combined with several optimization techniques to obtain a control policy that effectively handles this type of uncertainty. One such technique is Differential Dynamic Programming (DDP); the resulting algorithm inherits DDP's scalability to high-dimensional systems and its fast convergence. In this paper, we extend this combination to obtain probabilistic guarantees on the satisfaction of nonlinear constraints. In particular, we exploit the ability of gPC to express higher-order moments of the uncertainty distribution without any Gaussianity assumption, and we incorporate chance constraints that lead to expressions involving the state covariance. Furthermore, we demonstrate that implementing our algorithm in a receding horizon fashion yields control policies that effectively reduce the accumulation of uncertainty along the trajectory. The applicability of our method is verified through simulation results on a differential wheeled robot and a quadrotor performing obstacle avoidance tasks.
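
To make the role of the gPC moments concrete, the following is a minimal sketch and not the algorithm of this paper. It assumes a single scalar uncertain parameter with a standard normal distribution, a probabilists' Hermite basis, and a distribution-free Cantelli-type tightening of a scalar chance constraint; the paper's own constraint formulation may differ. The function names `gpc_coefficients`, `gpc_moments`, and `chance_constraint_margin` are hypothetical and introduced only for illustration.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He


def gpc_coefficients(f, order, n_quad=20):
    """Project f(theta), theta ~ N(0, 1), onto probabilists' Hermite polynomials.

    Assumption: one scalar Gaussian parameter, coefficients by Gauss quadrature.
    """
    nodes, weights = He.hermegauss(n_quad)
    # hermegauss uses the weight exp(-x^2 / 2); rescale to the N(0, 1) density.
    weights = weights / math.sqrt(2.0 * math.pi)
    fvals = f(nodes)
    coeffs = []
    for i in range(order + 1):
        basis = He.hermeval(nodes, [0.0] * i + [1.0])  # He_i at the nodes
        # E[He_i^2] = i!, so divide the inner product by i!.
        coeffs.append(np.sum(weights * fvals * basis) / math.factorial(i))
    return np.array(coeffs)


def gpc_moments(coeffs):
    """Mean and variance read directly from the gPC coefficients."""
    norms = np.array([math.factorial(i) for i in range(1, len(coeffs))], dtype=float)
    return float(coeffs[0]), float(np.sum(coeffs[1:] ** 2 * norms))


def chance_constraint_margin(mean, var, bound, eps):
    """Margin of the tightened constraint mean + kappa * std <= bound.

    By Cantelli's inequality, a nonnegative margin implies
    P(x >= bound) <= eps for any distribution with these two moments.
    """
    kappa = math.sqrt((1.0 - eps) / eps)
    return bound - (mean + kappa * math.sqrt(var))


# Hypothetical quantity of interest: a quadratic function of the parameter.
coeffs = gpc_coefficients(lambda t: 1.0 + 2.0 * t + 0.5 * t**2, order=3)
mean, var = gpc_moments(coeffs)
margin = chance_constraint_margin(mean, var, bound=10.0, eps=0.1)
```

For this quadratic example the analytic mean is 1.5 and the analytic variance is 4.5, so the sketch can be checked by hand. The tightened constraint uses only the first two moments, which is why no Gaussianity assumption on the propagated state is required; the price is that the Cantelli bound is generally conservative.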
