A two-stage learning control system

Learning heuristics for an on-line controller are presented, and various aspects of the problem are discussed. The controller is required to achieve optimal regulator control of an unknown process in the face of random disturbances. A two-stage learning method is implemented on a computer. The first stage is coarse: it attempts to satisfy the terminal boundary conditions by means of subgoal learning, which yields an approximation to the optimum control law. Rote learning is also carried out during this stage. The second, or tuning, stage improves on this result by applying reinforcement learning to the integral performance criterion. The effect of varying the parameters of the learning algorithm is studied. A hybrid computer simulation of a second-order plant driven by a single input with two possible levels is discussed.
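As a rough illustration of the tuning stage only, the following minimal sketch simulates a second-order plant driven by a two-level input and adjusts one parameter of a switching rule to reduce an integral performance measure. Everything specific here is an assumption made for illustration and is not taken from the work itself: the plant is a double integrator, the control law is a linear switching line with a hypothetical `slope` parameter, the performance criterion is the integral of squared error, and the tuning rule is a plain accept-if-better search standing in for the reinforcement scheme. The coarse first stage is not modeled; its result is represented by the initial `slope` value.

```python
import random


def integral_cost(slope, steps=400, dt=0.02, u_max=1.0, noise=0.0, seed=0):
    """Simulate an assumed double-integrator plant under a two-level input.

    Assumed switching rule: u = -u_max if x + slope * v > 0, else +u_max.
    Returns the integral of squared position error over the run.
    """
    rng = random.Random(seed)   # fixed seed keeps the disturbance repeatable
    x, v = 1.0, 0.0             # initial error and rate (illustrative values)
    cost = 0.0
    for _ in range(steps):
        u = -u_max if (x + slope * v) > 0.0 else u_max
        v += (u + noise * rng.gauss(0.0, 1.0)) * dt   # random disturbance on input
        x += v * dt
        cost += x * x * dt
    return cost


def tune(slope, trials=30, step=0.2, **plant_kwargs):
    """Tuning-stage stand-in: keep a perturbed slope only if the cost drops."""
    best = integral_cost(slope, **plant_kwargs)
    for _ in range(trials):
        improved = False
        for candidate in (slope + step, slope - step):
            if candidate <= 0.0:
                continue        # keep the switching line stabilizing
            cost = integral_cost(candidate, **plant_kwargs)
            if cost < best:
                slope, best, improved = candidate, cost, True
                break
        if not improved:
            step *= 0.5         # shrink the perturbation when nothing helps
    return slope, best


if __name__ == "__main__":
    coarse_slope = 2.0          # stands in for the first-stage approximation
    tuned_slope, tuned_cost = tune(coarse_slope, noise=0.5)
    print("coarse cost:", integral_cost(coarse_slope, noise=0.5))
    print("tuned slope:", tuned_slope, "tuned cost:", tuned_cost)
```

Because a candidate is accepted only when it lowers the measured cost, the tuned result is never worse than the coarse starting point on the same disturbance sequence. With a fresh disturbance on every trial, as in an on-line setting, the comparison would be noisy and some form of averaging would be needed.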