Structure-Adaptable Neurocontrollers: A Hardware-Friendly Approach

This paper presents a hardware-friendly approach for adapting the structure of a reinforcement-learning-based neurocontroller. An unsupervised clustering algorithm is used to partition the state space of a system and to adapt the size of its reinforcement module. On the well-known inverted pendulum problem, the system has proven to be much faster than previous neurocontroller approaches. We are currently working on an implementation of the system using field-programmable logic devices.
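As an illustration of the idea only, the following minimal sketch shows how an incremental clustering step could partition a state space while growing a per-region reinforcement table. It is not the paper's algorithm: the distance-threshold rule, the class name `AdaptivePartition`, and the parameters `vigilance` and `learning_rate` are all assumptions made for this example.

```python
import math


class AdaptivePartition:
    """Hypothetical incremental clustering: a new region is created whenever
    a state lies farther than `vigilance` from every existing prototype."""

    def __init__(self, vigilance, learning_rate=0.1):
        self.vigilance = vigilance          # assumed distance threshold
        self.learning_rate = learning_rate  # assumed prototype update rate
        self.prototypes = []                # one prototype vector per region
        self.values = []                    # one reinforcement estimate per region

    def assign(self, state):
        """Return the region index for `state`, adding a region if needed."""
        if self.prototypes:
            dists = [math.dist(state, p) for p in self.prototypes]
            best = min(range(len(dists)), key=dists.__getitem__)
            if dists[best] <= self.vigilance:
                # Move the winning prototype a small step toward the state.
                proto = self.prototypes[best]
                self.prototypes[best] = [
                    p + self.learning_rate * (s - p)
                    for p, s in zip(proto, state)
                ]
                return best
        # No prototype is close enough: grow the partition and the
        # reinforcement table together, so their sizes always match.
        self.prototypes.append(list(state))
        self.values.append(0.0)
        return len(self.prototypes) - 1


partition = AdaptivePartition(vigilance=0.5)
first = partition.assign([0.0, 0.0])   # creates the first region
second = partition.assign([0.1, 0.0])  # close enough to reuse that region
third = partition.assign([2.0, 2.0])   # far away, so a new region is added
```

In this sketch the size of the reinforcement table is not fixed in advance; it follows the number of regions the clustering step discovers, which is the structural adaptation the abstract describes in general terms.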
