New designs for universal stability in classical adaptive control and reinforcement learning

Many researchers believe that neurocontrollers should not be used in real-world applications until firm, unconditional stability theorems have been established for them. This paper explains key ideas from the author's previous paper (1998), which discusses the problem of "universal stability" (in the linear case) and proposes a new solution. New forms of real-time "reinforcement learning" or "approximate dynamic programming", developed for the nonlinear stochastic case, appear to permit this kind of universal stability. They also offer hope of easier and more reliable convergence in off-line learning applications, such as those discussed in this paper or those required for nonlinear robust control. Challenges for future research are also discussed.
