Approximate Dynamic Programming with Applications

This thesis studies approximate optimal control of nonlinear systems, with particular attention to global solutions and to the computation of approximately optimal feedback controllers. The solution to an optimal control problem is characterized by the optimal value function, which for a large class of problems must satisfy a Hamilton-Jacobi-Bellman type equation. Two common methods for solving such equations, policy iteration and value iteration, are both studied in this thesis.

An approximate policy iteration algorithm is presented for both the continuous-time and discrete-time settings, and the sequence it produces is shown to converge monotonically towards the optimal value function. A multivariate polynomial relaxation algorithm is proposed for linearly constrained discrete-time optimal control problems with convex cost. Relaxed value iteration is studied for constrained linear systems with convex piecewise linear cost; it is shown how an explicit piecewise linear control law can be computed and how the resulting lookup table can be reduced efficiently.

The on-line implementation of receding horizon controllers, even for linear systems, is usually restricted to systems with slow dynamics. One reason is that the delay between measurement and actuation, introduced by computing the control signal on-line, can severely degrade the performance of systems with fast dynamics. A method to improve robustness against such delays and other uncertainties is presented.

A case study on the control of DC-DC converters is given. The feasibility of a relaxed dynamic programming algorithm is verified by synthesizing controllers for both a step-down converter and a step-up converter. The control performance is evaluated both in simulations and in real experiments.
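As a point of reference for the policy iteration idea mentioned above, the following is a minimal sketch of exact policy iteration in the simplest special case: an unconstrained discrete-time linear system with quadratic cost. It is not the approximate algorithm developed in the thesis. The system matrices, the weights `Q` and `R`, and the initial stabilizing gain `K0` are hypothetical values chosen only for illustration. Each pass evaluates the cost matrix `P` of the current linear feedback `u = -K x` and then improves the gain; in this setting the cost matrices are expected to decrease monotonically towards the solution of the Riccati equation.

```python
import numpy as np


def evaluate_policy(A, B, Q, R, K):
    """Cost matrix P of the feedback u = -K x, i.e. the solution of
    P = Acl' P Acl + Q + K' R K with Acl = A - B K (assumed stable)."""
    Acl = A - B @ K
    M = Q + K.T @ R @ K
    n = A.shape[0]
    # Vectorized discrete Lyapunov equation: (I - Acl' (x) Acl') vec(P) = vec(M).
    lhs = np.eye(n * n) - np.kron(Acl.T, Acl.T)
    P = np.linalg.solve(lhs, M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off


def improve_policy(A, B, R, P):
    """Gain that minimizes the one-step cost plus the cost-to-go x' P x."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


def policy_iteration(A, B, Q, R, K0, iterations=20):
    """Alternate policy evaluation and improvement; return the final gain
    and the list of cost matrices, one per evaluated policy."""
    K, history = K0, []
    for _ in range(iterations):
        P = evaluate_policy(A, B, Q, R, K)
        history.append(P)
        K = improve_policy(A, B, R, P)
    return K, history


# Hypothetical example data: a sampled double integrator (not from the thesis).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 2.0]])  # assumed stabilizing initial gain

K, history = policy_iteration(A, B, Q, R, K0)
for k, P in enumerate(history[:5]):
    print(k, np.trace(P))  # the trace of P should not increase with k
```

The Lyapunov equation is solved by vectorization to keep the sketch dependent on NumPy only; for larger state dimensions a dedicated solver would be a more sensible choice.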
