Neurocontrollers trained with rules extracted by a genetic assisted reinforcement learning system

This paper proposes a novel system for extracting rules for temporal control problems and presents a new way of designing neurocontrollers. The system employs a hybrid genetic-search and reinforcement-learning strategy to extract the rules; the learning strategy requires no supervision and no reference model. The extracted rules are weighted micro rules that operate on small neighborhoods of the admissible control space. These rules are further refined by additional genetic search and reinforcement, which reduces the number of micro rules and yields a smaller set of macro rules. The macro rules can be used to train a feedforward multilayer perceptron neurocontroller, and either rule set may also be used directly in a table look-up controller. We evaluate the macro-rule-based neurocontroller on four benchmarks. The first application verifies the system's ability to learn optimal linear control strategies; the other three involve engine idle speed control, bioreactor control, and stabilizing two poles on a moving cart. These problems are highly nonlinear and unstable, and may include noise and delays in the plant dynamics. In terms of retrievals, the neurocontrollers generally outperform controllers that use a table look-up method, though both controllers are robust against noise disturbances and plant parameter variations.
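The abstract outlines the pipeline only at a high level. The following is a minimal, hypothetical Python sketch of the core idea as described: weighted micro rules over small neighborhoods of the admissible space, refined by genetic search plus reinforcement with no supervision, and queried through a table look-up controller. The MicroRule class, the toy unstable plant, the credit-assignment scheme, and all other names here are illustrative assumptions, not the authors' implementation.

```python
import random

# Hypothetical sketch of the abstract's pipeline; everything below is an
# illustrative assumption, not the paper's actual system.

class MicroRule:
    """A weighted rule over a small neighborhood of the admissible state space."""
    def __init__(self, center, radius, action):
        self.center = center   # neighborhood center in state space
        self.radius = radius   # neighborhood half-width
        self.action = action   # control output emitted inside the neighborhood
        self.weight = 0.0      # credit accumulated via reinforcement

    def matches(self, state):
        return abs(state - self.center) <= self.radius

def lookup(rules, state, default=0.0):
    """Table look-up controller: highest-weight rule covering the state."""
    hits = [r for r in rules if r.matches(state)]
    return max(hits, key=lambda r: r.weight).action if hits else default

def run_episode(rules, steps=50):
    """Toy unstable plant x' = 1.05*x + u (an assumption); returns the
    negative quadratic cost and the set of rules that fired."""
    x, cost, fired = 0.5, 0.0, set()
    for _ in range(steps):
        u = lookup(rules, x)
        fired.update(id(r) for r in rules if r.matches(x))
        x = 1.05 * x + u
        cost += x * x + 0.1 * u * u
    return -cost, fired

def refine(rules, generations=200):
    """Genetic search + reinforcement: credit rules that fired with the
    episode return, then mutate the actions of the weakest quartile."""
    for _ in range(generations):
        ret, fired = run_episode(rules)
        for r in rules:
            if id(r) in fired:
                r.weight += ret              # reinforcement, no supervision
        rules.sort(key=lambda r: r.weight)
        for r in rules[: len(rules) // 4]:   # genetic step: perturb weak rules
            r.action += random.gauss(0.0, 0.05)
    return rules

# Tile the admissible range [-2, 2] with overlapping micro rules and refine.
rules = [MicroRule(c / 4 - 2, 0.25, random.uniform(-1, 1)) for c in range(17)]
refine(rules)
print(lookup(rules, 0.5))
```

In the system the abstract describes, a consolidation step would then merge the surviving micro rules into macro rules, which serve as training examples for a feedforward multilayer perceptron; this sketch stops at the table look-up stage.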
