Use of Stochastic Automata for Parameter Self-Optimization with Multimodal Performance Criteria

The application of stochastic automata to adaptive parameter optimization problems is considered. The fundamental problem is that of relating the concepts of automata theory and of the learning theory of mathematical psychology to the usual notion of a performance index in a control system. Consideration is given to a number of possible automata structures, both linear and nonlinear. One particular linear model is derived that has optimal rather than merely expedient convergence properties. A basic feature of this model is that it relies on a system response set of rewards and inactions, the latter being substituted for the more common penalty responses. This choice of response set is directly related to the achievement of the desired behavior. Simulations are described for the maximization of multimodal performance functions intentionally constructed to demonstrate the use of the method in situations where relative extrema occur. An example is also given of the automaton used as a direct adaptive controller for a third-order control system.
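The reward and inaction response set can be illustrated with a minimal sketch. The abstract does not state the update equations, so the code below assumes the standard linear reward-inaction scheme from the learning-automata literature as a stand-in, not necessarily the exact model derived in the paper. The function names, the step size `a`, and the `reward_probs` list (a hypothetical probability of reward for each discretized parameter value, for example a normalized performance index) are all illustrative assumptions.

```python
import random


def reward_inaction_step(p, chosen, rewarded, a=0.05):
    """Return the action probabilities after one trial (assumed L_R-I rule)."""
    if not rewarded:
        # Inaction: the probabilities are left unchanged (no penalty update).
        return list(p)
    # Reward: move the chosen action toward 1 and shrink the others
    # proportionally, so the probabilities still sum to 1.
    return [
        pj + a * (1.0 - pj) if j == chosen else pj * (1.0 - a)
        for j, pj in enumerate(p)
    ]


def run(reward_probs, steps=5000, a=0.05, seed=0):
    """Simulate the automaton against a hypothetical random environment."""
    rng = random.Random(seed)
    n = len(reward_probs)
    p = [1.0 / n] * n  # start with no preference among the actions
    for _ in range(steps):
        chosen = rng.choices(range(n), weights=p)[0]
        rewarded = rng.random() < reward_probs[chosen]
        p = reward_inaction_step(p, chosen, rewarded, a)
    return p


if __name__ == "__main__":
    # Hypothetical multimodal profile: a relative maximum at index 1
    # and a larger maximum at index 4.
    print(run([0.2, 0.6, 0.3, 0.4, 0.9, 0.5]))
```

Because an unrewarded trial changes nothing, the probability vector moves only when an action is rewarded. A smaller `a` slows the updates and, in the standard analysis of this scheme, lowers the chance of locking onto a relative rather than the absolute maximum.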