Numerous methods have been proposed for designing control systems that learn to function in unknown or partially known environments. Most learning schemes depart radically from the continuous parameter-adjustment techniques that grew out of early work on model reference systems. The principal contributions to the area have been controller models and algorithms. In studying these models, the system is often abstracted to such an extent that contact with practical considerations is lost. The objective of this paper is to present some results in the theory of learning control and also to re-examine some of the practical problems encountered in applying a learning controller. The paper defines the subgoal as subordinate to the primary goal of minimizing the performance index. The subgoal must evaluate each decision one control interval after it is instituted. The subgoal problem is to choose a subgoal that directs the learning process toward the optimum prescribed by the given performance index. An analytical solution is presented and then extended heuristically to the general case. The extended method makes use of a priori information about the plant. Two other problems are also discussed. A fixed grid is used to partition the state space into control situations, and a method of extending the grid is proposed and evaluated. The controller is also initialized using the a priori information. A full-scale simulation confirms that the proposed methods of choosing the subgoal, extending the fixed grid, and initializing the controller improve on previous methods.