New Synthesis Techniques for Finite Time Stochastic Adaptive Controllers

Abstract: A method for synthesizing stochastic adaptive controllers for finite-time problems is established. In this technique, called the method of utility costs, control policies are generated from the dynamic programming equations for the closed-loop optimal (CLO) policy by replacing the optimal cost-to-go at each stage with an approximation. This approximation is the cost incurred by using a prespecified control policy (the utility control sequence) for all future control decisions. Control policies synthesized by this method are actively adaptive even when the utility control sequences are chosen to be passive policies. A result useful for establishing theoretical performance bounds on the synthesized control policy is given. The method of utility costs is then applied to define a new class of actively adaptive control policies $\{C_\alpha^m\}_{m=1}^{N-1}$. The control policy $C_\alpha^m$ is generated by specifying the utility control sequences to be certain passive policies based on the assumption that only $m$ future measurements will be taken. The use of passive control policies to generate active control policies is desirable, since passive policies can be determined by solving deterministic rather than stochastic optimal control problems. The assumption of only $m$ future measurements allows a further reduction in the computation involved. A theoretical performance bound is given for policy $C_\alpha^{N-1}$, demonstrating its improvement over the performance of the open-loop feedback (OLF) policy. A numerical example is included to study by simulation the relative performance of the $C_\alpha^{N-1}$ policy and to suggest a method for implementation.
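
The substitution at the heart of the method of utility costs can be sketched in equations. The notation here is assumed for illustration and does not appear in the abstract: $I_k$ is the information available at stage $k$, $u_k$ the control, $L_k$ the stage cost, $J_k^*$ the optimal cost-to-go, and $\tilde{J}_{k+1}$ the cost-to-go incurred when the utility control sequence is applied from stage $k+1$ onward. Under these assumptions, the dynamic programming recursion for the CLO policy over stages $k = 0, \dots, N-1$ has the form

$$
J_k^*(I_k) = \min_{u_k} \, E\left\{ L_k(x_k, u_k) + J_{k+1}^*(I_{k+1}) \mid I_k, u_k \right\},
$$

and a policy synthesized by the method of utility costs would instead select

$$
u_k(I_k) = \arg\min_{u_k} \, E\left\{ L_k(x_k, u_k) + \tilde{J}_{k+1}(I_{k+1}) \mid I_k, u_k \right\}.
$$

In this sketch the expectation is still taken over the future information $I_{k+1}$, whose distribution depends on $u_k$. This is one way to see why the resulting policy can be actively adaptive even when $\tilde{J}_{k+1}$ comes from a passive utility control sequence.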