On-line learning optimal control using successive approximation techniques

The application of learning theory to on-line optimization of unknown or poorly defined plants is discussed. On-line optimization is achieved by a learning algorithm that alters a trainable controller on the basis of an instantaneous performance criterion, or subgoal. The subgoal is related to the overall goal, the integral cost, through successive approximations to the Hamilton-Jacobi equation. The resulting piecewise linear controller is implemented by an encoder consisting of threshold logic units and a classifier consisting of a set of logic switching functions. The classifier is determined by an algorithm developed by Arkadev and Braverman. Features of the learning algorithm are illustrated by minimum-time and minimum-time-fuel problems.
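
The controller structure described above (threshold logic units feeding a logic classifier that selects a linear control law) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two hyperplanes in `TLUS`, the code-to-region table `REGION_OF_CODE`, the gains in `LINEAR_LAWS`, and the function names. The learning step, the subgoal, and the Arkadev-Braverman procedure are not shown; the table simply stands in for a classifier that has already been determined.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical threshold logic units (TLUs): each is (weights, bias).
# A unit fires (outputs 1) when weights . x + bias >= 0.
TLUS = [
    ((1.0, 0.0), 0.0),  # assumed hyperplane x1 = 0
    ((0.0, 1.0), 0.0),  # assumed hyperplane x2 = 0
]

# Hypothetical classifier: a switching function from the binary TLU code
# to a region label. Here opposite quadrants share one region.
REGION_OF_CODE: Dict[Tuple[int, ...], int] = {
    (1, 1): 0,
    (0, 0): 0,
    (1, 0): 1,
    (0, 1): 1,
}

# Hypothetical linear law per region: u = gains . x + offset.
LINEAR_LAWS: Dict[int, Tuple[Tuple[float, ...], float]] = {
    0: ((-1.0, -1.0), 0.0),
    1: ((-0.5, -2.0), 0.0),
}


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def encode(x: Sequence[float]) -> Tuple[int, ...]:
    """Encoder: map the state to a binary code, one bit per TLU."""
    return tuple(1 if dot(w, x) + bias >= 0 else 0 for w, bias in TLUS)


def control(x: Sequence[float]) -> float:
    """Piecewise linear controller: encode, classify, apply the region's law."""
    region = REGION_OF_CODE[encode(x)]
    gains, offset = LINEAR_LAWS[region]
    return dot(gains, x) + offset


if __name__ == "__main__":
    for state in [(1.0, 2.0), (1.0, -2.0)]:
        print(state, encode(state), control(state))
```

The control is linear within each region and switches only when the state crosses a TLU hyperplane, which is what makes the overall law piecewise linear. In a trainable version, the TLU weights, the classifier table, and the gains would be the quantities adjusted by the learning algorithm.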