Differential dynamic programming methods for solving bang-bang control problems

Differential dynamic programming is a technique, based on dynamic programming rather than the calculus of variations, for determining the optimal control function of a nonlinear system. Unlike conventional dynamic programming, where the optimal cost function is considered globally, differential dynamic programming applies the principle of optimality in the neighborhood of a nominal, possibly nonoptimal, trajectory. This allows the coefficients of a linear or quadratic expansion of the cost function to be computed in reverse time along the trajectory; these coefficients may then be used to yield a new, improved trajectory (i.e., the algorithms are of the "successive sweep" type). A class of nonlinear control problems, linear in the control variables, is studied using differential dynamic programming. It is shown that for the free-end-point problem the first partial derivatives of the optimal cost function are continuous throughout the state space, while the second partial derivatives experience jumps at switch points of the control function. A control problem that has an analytic solution is used to illustrate these points. The fixed-end-point problem is converted into an equivalent free-end-point problem by adjoining the end-point constraints to the cost functional using Lagrange multipliers; a useful interpretation of Pontryagin's adjoint variables for this type of problem emerges from this treatment. These results are used to devise new second- and first-order algorithms for determining the optimal bang-bang control by successively improving a nominal guessed control function. The usefulness of the proposed algorithms is illustrated by the computation of a number of example control problems.
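The successive-sweep idea described above can be sketched in code. The following is a minimal first-order illustration, not the paper's exact algorithm: a discrete-time double integrator whose dynamics are linear in the control, with |u| ≤ 1 and a terminal cost toward an (assumed, deliberately unreachable) target state. Because the Hamiltonian is linear in u, minimizing it along the backward sweep yields a bang-bang update u = -sign(Bᵀλ); the sweep is repeated from an improved nominal trajectory each iteration. All numerical values (dynamics matrices, target, horizon) are illustrative assumptions.

```python
import numpy as np

# Illustrative setup (assumed, not from the paper): discrete double
# integrator x_{k+1} = A x_k + B u_k with dt = 0.1 and |u_k| <= 1.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([0.005, 0.1])          # [dt^2/2, dt]
Q = np.zeros((2, 2))                # running state cost (zero here)
Qf = np.eye(2)                      # terminal cost weight
xg = np.array([20.0, 10.0])         # target state, chosen unreachable
N = 50                              # horizon length


def rollout(x0, u):
    """Integrate the dynamics forward along control u; return the
    state trajectory and the total cost."""
    x = np.zeros((N + 1, 2))
    x[0] = x0
    J = 0.0
    for k in range(N):
        J += 0.5 * x[k] @ Q @ x[k]
        x[k + 1] = A @ x[k] + B * u[k]
    err = x[N] - xg
    return x, J + 0.5 * err @ Qf @ err


def sweep(x):
    """Backward sweep along the nominal trajectory: propagate the
    adjoint lambda_k = Q x_k + A' lambda_{k+1} in reverse time.  The
    Hamiltonian is linear in u, so its minimizer over |u| <= 1 is
    bang-bang: u_k = -sign(B' lambda_{k+1})."""
    lam = Qf @ (x[N] - xg)
    u_new = np.zeros(N)
    for k in reversed(range(N)):
        u_new[k] = -np.sign(B @ lam)   # switching function B' lambda
        lam = Q @ x[k] + A.T @ lam
    return u_new


x0 = np.array([1.0, 0.0])
u = np.zeros(N)                     # nominal (nonoptimal) guessed control
costs = []
for _ in range(5):
    x, J = rollout(x0, u)
    costs.append(J)
    u = sweep(x)

print(costs[0], costs[-1])          # cost falls, then the sweep fixes
```

For this particular problem the sweep settles on u ≡ +1 (full thrust toward the unreachable target), so no switch occurs; problems with attainable targets produce switching controls, and there a naive full replacement of the control can overshoot, which is why the paper's algorithms temper the update rather than accept it wholesale.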
