Adding learning to cellular genetic algorithms for training recurrent neural networks

This paper proposes a hybrid optimization algorithm that combines local search (individual learning) with cellular genetic algorithms (GAs) for training recurrent neural networks (RNNs). Each weight of an RNN is encoded as a floating-point number, and the concatenation of these numbers forms a chromosome. Reproduction takes place locally on a square grid, with each grid point holding one chromosome. Two mechanisms for combining cellular GAs and learning, Lamarckian and Baldwinian, are compared. Different hill-climbing algorithms are incorporated into the cellular GAs as learning methods: the real-time recurrent learning (RTRL) algorithm, simplified versions of RTRL obtained by freezing some of the weights, and the delta rule. The delta rule, the simplest form of learning considered, is implemented by treating the RNNs as feedforward networks during learning. The hybrid algorithms are used to train RNNs to solve a long-term dependency problem. The results show that Baldwinian learning is inefficient in assisting the cellular GA. It is conjectured that the more difficult it is for genetic operations to produce genotypic changes that match the phenotypic changes due to learning, the poorer the convergence of Baldwinian learning. Most of the combinations using the Lamarckian mechanism reduce the number of generations required to find an optimal network; however, only a few reduce the actual time taken. Embedding the delta rule in the cellular GA is found to be the fastest method. It is also concluded that learning should not be too extensive if the hybrid algorithm is to benefit from learning.
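
To make the grid-based reproduction and the Lamarckian/Baldwinian distinction concrete, the following is a minimal Python sketch of one generation of such a hybrid cellular GA. It is an illustration under stated assumptions, not the paper's implementation: it assumes a toroidal grid with a von Neumann neighbourhood, a dummy quadratic objective in place of the long-term dependency task, and a generic gradient hill-climber standing in for RTRL or the delta rule. All names, sizes, and operator choices are illustrative.

```python
import numpy as np

GRID = 10          # cells per side of the square grid (assumed size)
N_WEIGHTS = 30     # length of the flat weight vector, one RNN per cell

def fitness(weights):
    # Placeholder objective: the paper evaluates each RNN on a
    # long-term dependency task; here a dummy quadratic is used.
    return -np.sum(weights ** 2)

def learn(weights, steps=5, lr=0.05):
    # Placeholder hill-climber standing in for RTRL / the delta rule:
    # a few gradient steps on the dummy objective above.
    w = weights.copy()
    for _ in range(steps):
        w -= lr * 2 * w          # gradient of sum(w^2)
    return w

def neighbours(i, j):
    # Von Neumann neighbourhood on a toroidal grid (an assumption;
    # cellular GAs vary in their choice of neighbourhood).
    return [((i - 1) % GRID, j), ((i + 1) % GRID, j),
            (i, (j - 1) % GRID), (i, (j + 1) % GRID)]

def step(pop, lamarckian=True):
    new_pop = pop.copy()
    for i in range(GRID):
        for j in range(GRID):
            learned = learn(pop[i, j])
            if lamarckian:
                # Lamarckian: learned weights are written back
                # into the genotype and are inherited.
                new_pop[i, j] = learned
            # Baldwinian: fitness reflects learning, but the
            # genotype itself is left untouched.
            fit_here = fitness(learned)
            # Local selection: recombine with the fittest neighbour.
            best = max(neighbours(i, j),
                       key=lambda c: fitness(learn(pop[c])))
            if fitness(learn(pop[best])) > fit_here:
                mask = np.random.rand(N_WEIGHTS) < 0.5      # uniform crossover
                child = np.where(mask, pop[best], new_pop[i, j])
                child += np.random.normal(0, 0.01, N_WEIGHTS)  # mutation
                new_pop[i, j] = child
    return new_pop

pop = np.random.normal(0, 1, (GRID, GRID, N_WEIGHTS))
for gen in range(20):
    pop = step(pop, lamarckian=True)
```

The single `lamarckian` flag captures the design difference studied in the paper: in Baldwinian mode the learned weights influence only selection, whereas in Lamarckian mode they are copied back into the chromosome and inherited by offspring.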
