Mixed Strategy for Constrained Stochastic Optimal Control

Choosing control inputs randomly can reduce the expected cost in optimal control problems with stochastic constraints, such as stochastic model predictive control (SMPC). We consider a controller with initial randomization, meaning that the controller randomly chooses one of K+1 control sequences at the beginning (called K-randomization). It is known that, for a finite-state, finite-action Markov Decision Process (MDP) with K constraints, K-randomization is sufficient to achieve the minimum cost. We show that the same result holds for stochastic optimal control problems with continuous state and action spaces. Furthermore, we show that randomizing the control input can reduce the cost when the optimization problem is nonconvex, and that the cost reduction equals the duality gap. We then provide necessary and sufficient conditions for the optimality of a randomized solution and develop an efficient solution method based on dual optimization. In the special case of K=1, such as a joint chance-constrained problem, the dual optimization can be solved even more efficiently by root finding. Finally, we validate the theory and demonstrate the solution method on several practical problems, ranging from path planning to the planning of entry, descent, and landing (EDL) for future Mars missions.
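
To make the claims above concrete, the following is a minimal sketch of the K-randomized formulation and its dual, using assumed notation not given in the abstract: u denotes a candidate control sequence, J(u) its expected cost, and g_i(u) <= 0 the i-th constraint written in expectation form (e.g., a chance constraint).

```latex
% Sketch (assumed notation): u is a control sequence, J(u) its expected cost,
% and g_i(u) <= 0 the i-th constraint in expectation form.
% K-randomization: choose u^{(j)} with probability p_j, for j = 0, ..., K.
\begin{align}
  \min_{\{p_j\},\,\{u^{(j)}\}} \quad & \sum_{j=0}^{K} p_j\, J\bigl(u^{(j)}\bigr) \\
  \text{s.t.} \quad & \sum_{j=0}^{K} p_j\, g_i\bigl(u^{(j)}\bigr) \le 0,
      \qquad i = 1, \dots, K, \\
  & p_j \ge 0, \qquad \sum_{j=0}^{K} p_j = 1.
\end{align}
% Dual of the deterministic (pure-strategy) problem:
\begin{equation}
  d^\star \;=\; \max_{\lambda \ge 0}\; \min_{u}
      \Bigl[\, J(u) + \sum_{i=1}^{K} \lambda_i\, g_i(u) \,\Bigr].
\end{equation}
% Claim stated in the abstract: the optimal K-randomized cost equals d^*, so the
% cost reduction over the best pure strategy equals the duality gap.
```

Under this reading, if the pure-strategy problem is convex with zero duality gap, randomization offers no benefit, which is consistent with the nonconvexity condition stated above; for K=1 the dual reduces to a scalar maximization over lambda, which is why root finding suffices.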
