A learning automaton approach to trajectory learning and control system design using dynamic recurrent neural networks

A new approach based on the theory of learning automata is presented for training neural networks with recurrent connections and dynamical processing elements. Because the approach requires no gradient computations, it affords a simple implementation. Both linear and nonlinear reinforcement actions are suggested, each of which results in a specific training algorithm. Applications of the method to two problems, viz. learning of continuous-time trajectories and control system design to stabilize a nonlinear dynamical plant, are outlined.
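To give a rough sense of what a linear reinforcement action looks like, the sketch below implements the textbook linear reward-inaction update from learning automata theory. It is an illustration under assumptions, not the paper's training algorithm: the function names, the learning rate `rate`, and the toy environment with fixed reward probabilities are all hypothetical. In a network-training setting, each action could stand for a candidate adjustment to a weight, with the reward derived from whether the trajectory error decreased.

```python
import random


def lri_update(probs, chosen, reward, rate=0.1):
    """Linear reward-inaction update (textbook scheme, assumed here).

    On reward, probability mass moves toward the chosen action.
    On penalty, the probabilities are left unchanged.
    No gradient of any error function is needed.
    """
    if not reward:
        return list(probs)
    return [
        p + rate * (1.0 - p) if i == chosen else (1.0 - rate) * p
        for i, p in enumerate(probs)
    ]


def run_automaton(reward_probs, steps=2000, rate=0.05, seed=0):
    """Toy environment (hypothetical): action i is rewarded with
    probability reward_probs[i]. Returns the final action probabilities."""
    rng = random.Random(seed)
    n = len(reward_probs)
    probs = [1.0 / n] * n
    for _ in range(steps):
        chosen = rng.choices(range(n), weights=probs)[0]
        reward = rng.random() < reward_probs[chosen]
        probs = lri_update(probs, chosen, reward, rate)
    return probs


if __name__ == "__main__":
    # Hypothetical reward probabilities for three candidate actions.
    print(run_automaton([0.2, 0.8, 0.4]))
```

The update keeps the probabilities summing to one by construction, since the mass added to the chosen action equals the mass removed from the others. A nonlinear reinforcement action would replace the linear terms in `lri_update` with nonlinear functions of the current probabilities.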
