A Lamarckian Hybrid of Differential Evolution and Conjugate Gradients for Neural Network Training

The paper describes two schemes that follow the model of Lamarckian evolution and combine differential evolution (DE), a population-based stochastic global search method, with the conjugate gradient (CG) local optimization algorithm. In the first scheme, each offspring is fine-tuned by CG before competing with its parent. In the second, CG is used to improve both parents and offspring, in a manner that is seamless for individuals surviving more than one generation. Experiments involved training the weights of feed-forward neural networks on three synthetic and four real-life problems. In six out of seven cases the DE–CG hybrid, which preserves and reuses information about each solution's local optimization process, outperformed two recent variants of DE.
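The following is a minimal sketch of the first scheme as described in the abstract: a DE generation loop in which each offspring is refined by a few CG iterations before the greedy parent-versus-offspring selection, and the refined weights are written back into the individual (the Lamarckian step). The objective `loss`, the control parameters `F`, `CR`, `POP_SIZE`, `CG_STEPS`, and the use of SciPy's generic CG routine (rather than the scaled conjugate gradient method the paper builds on) are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from scipy.optimize import minimize

def loss(w):
    """Placeholder objective, standing in for a network's training error as a
    function of its flattened weight vector w (here a simple quadratic)."""
    return float(np.sum((w - 1.0) ** 2))

DIM, POP_SIZE, F, CR, CG_STEPS, GENERATIONS = 10, 20, 0.5, 0.9, 5, 50
rng = np.random.default_rng(0)

pop = rng.uniform(-1.0, 1.0, size=(POP_SIZE, DIM))
fitness = np.array([loss(w) for w in pop])

for gen in range(GENERATIONS):
    for i in range(POP_SIZE):
        # DE/rand/1 mutation: combine three distinct individuals other than i.
        a, b, c = rng.choice([j for j in range(POP_SIZE) if j != i], 3, replace=False)
        mutant = pop[a] + F * (pop[b] - pop[c])

        # Binomial crossover with the current parent.
        mask = rng.random(DIM) < CR
        mask[rng.integers(DIM)] = True          # ensure at least one gene comes from the mutant
        trial = np.where(mask, mutant, pop[i])

        # Lamarckian step: refine the offspring with a few CG iterations and
        # keep the improved weights in the individual itself.
        result = minimize(loss, trial, method='CG', options={'maxiter': CG_STEPS})
        trial, trial_fit = result.x, result.fun

        # Greedy selection: the refined offspring competes with its parent.
        if trial_fit <= fitness[i]:
            pop[i], fitness[i] = trial, trial_fit

print('best loss:', fitness.min())
```

The second scheme described in the abstract differs only in that the parents, too, are periodically passed through the CG refinement, so an individual that survives several generations continues its local optimization where it left off.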
