Paper information - On-line learning optimal control using successive approximation techniques

On-line learning optimal control using successive approximation techniques

The application of learning theory to on-line optimization of unknown or poorly defined plants is discussed. An on-line optimization procedure is achieved by means of a learning algorithm which alters a trainable controller on the basis of an instantaneous performance criterion or subgoal. The subgoal is related to the over-all goal, the integral cost, by means of successive approximations to the Hamilton-Jacobi equation. The resulting piecewise linear controller is implemented by means of an encoder consisting of threshold logic units and a classifier consisting of a set of logic switching functions. The classifier is determined by means of an algorithm developed by Arkadev and Braverman. Features of the learning algorithm are illustrated by minimum-time and minimum-time-fuel problems.
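The controller structure described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that each threshold logic unit tests one hyperplane in state space, that the classifier is a lookup table from the resulting binary code to a region index, and that each region carries its own linear law. The names `UNITS`, `CLASSIFIER`, and `GAINS` and all numeric values are hypothetical, and the learning algorithm that would adapt them is not shown.

```python
from typing import Dict, Sequence, Tuple

Vector = Sequence[float]
Unit = Tuple[Vector, float]    # (weights, bias) of one threshold logic unit
Linear = Tuple[Vector, float]  # (gain vector, offset) of one linear law


def tlu(weights: Vector, bias: float, x: Vector) -> int:
    """Threshold logic unit: returns 1 if w.x + bias >= 0, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias >= 0.0 else 0


def encode(x: Vector, units: Sequence[Unit]) -> Tuple[int, ...]:
    """Encoder: one bit per threshold logic unit."""
    return tuple(tlu(w, b, x) for w, b in units)


def control(
    x: Vector,
    units: Sequence[Unit],
    classifier: Dict[Tuple[int, ...], int],
    gains: Dict[int, Linear],
) -> float:
    """Piecewise linear control: encode the state, classify, apply that region's law."""
    region = classifier[encode(x, units)]
    k, offset = gains[region]
    return sum(ki * xi for ki, xi in zip(k, x)) + offset


# Hypothetical values for a two-dimensional state (assumptions, not from the paper).
UNITS = [((1.0, 0.0), 0.0), ((0.0, 1.0), 0.0)]  # bits test x1 >= 0 and x2 >= 0
CLASSIFIER = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 2}
GAINS = {
    0: ((0.0, 0.0), 1.0),    # constant control +1
    1: ((-1.0, -1.0), 0.0),  # linear state feedback
    2: ((0.0, 0.0), -1.0),   # constant control -1
}

if __name__ == "__main__":
    for state in [(-1.0, -2.0), (1.0, -2.0), (1.0, 2.0)]:
        print(state, control(state, UNITS, CLASSIFIER, GAINS))
```

In this sketch the classifier merges two binary codes into one region, which is the role a logic switching function would play; in the paper that mapping is determined by the Arkadev and Braverman algorithm rather than fixed by hand.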

Martin D. Levine | T. Vilis

[1] M. Levine et al. Learning control heuristics, 1968.

[2] Martin D. Levine et al. A two-stage learning control system, 1970.

[3] K. Fu et al. A heuristic approach to reinforcement learning control systems, 1965.

[4] Lloyd Jones. On the choice of subgoals for learning control systems, 1967.