Direct-reinforcement-adaptive-learning neural network control for nonlinear systems

This paper applies reinforcement learning techniques to the feedback control of nonlinear systems using neural networks (NNs). Even when a good model of the nonlinear system is known, it is often difficult to formulate a control law. The paper addresses this problem by showing how an NN can cope with nonlinearities through reinforcement learning, with no preliminary off-line learning phase required. Learning is performed online, driven by a binary reinforcement signal from a critic, without knowledge of the nonlinearity appearing in the system. The algorithm is derived from Lyapunov stability analysis, so that both tracking stability and error convergence can be guaranteed in the closed-loop system.
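The abstract gives no equations, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes a hypothetical scalar plant x' = f(x) + u, Gaussian radial basis features, a critic that outputs only the sign of the tracking error, and a leakage term (kappa) to keep the weights bounded. All names, gains, and the plant nonlinearity are illustrative assumptions.

```python
import math

def features(x, centers, width=0.5):
    """Gaussian radial basis features phi(x) (an assumed NN basis)."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]

def plant_nonlinearity(x):
    # Hypothetical "unknown" nonlinearity; the controller never calls this.
    return 2.0 * math.sin(x) + 0.5 * x

def sign(v):
    return (v > 0) - (v < 0)

def simulate(gamma, kappa=0.01, kv=5.0, t_end=20.0, dt=1e-3):
    """Track x_d(t) = sin(t); return (mean |error| over the last quarter, weights).

    gamma is the learning rate; gamma = 0 disables learning, leaving a
    plain proportional controller for comparison.
    """
    centers = [-2.0 + 0.5 * i for i in range(9)]
    w = [0.0] * len(centers)          # NN output weights, no off-line training
    x = 0.0
    steps = int(round(t_end / dt))
    tail_start = int(0.75 * steps)
    tail_err = 0.0
    for k in range(steps):
        t = k * dt
        x_d, x_d_dot = math.sin(t), math.cos(t)
        r = x - x_d                   # tracking error
        phi = features(x, centers)
        f_hat = sum(wi * pi for wi, pi in zip(w, phi))
        u = x_d_dot - kv * r - f_hat  # feedforward + feedback + NN compensation
        R = sign(r)                   # binary reinforcement signal from the critic
        # Online weight update driven only by the binary signal, with leakage.
        w = [wi + dt * gamma * (pi * R - kappa * wi) for wi, pi in zip(w, phi)]
        x += dt * (plant_nonlinearity(x) + u)   # forward-Euler plant step
        if k >= tail_start:
            tail_err += abs(r)
    return tail_err / (steps - tail_start), w

if __name__ == "__main__":
    err_learning, _ = simulate(gamma=5.0)
    err_fixed, _ = simulate(gamma=0.0)
    print("mean |error| with learning:   ", err_learning)
    print("mean |error| without learning:", err_fixed)
```

Running the script compares the same controller with and without the online update. The weight rule uses only the sign of the error, which is one simple way to read "binary reinforcement signal"; the paper's Lyapunov-derived rule may differ in form.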
