A case study in neural network training with the breeder genetic algorithm

Supervised training of a feed-forward neural network from examples is a classical problem, traditionally tackled by derivative-based methods (DBMs) that compute the gradient of the error, such as backpropagation. Conventional methods for non-linear optimization, such as Levenberg-Marquardt, quasi-Newton and conjugate gradient, are generally faster and more reliable, provided the objective function has continuous second derivatives. Their main drawbacks are well known, one of the most serious being the possibility of getting caught in local minima of the error surface. As an alternative, Evolutionary Algorithms (EAs) have demonstrated their ability to solve optimization tasks in a wide range of applications. However, their use in the neural network context has concentrated on binary-coded Genetic Algorithms, often without a systematic approach, and they are thus more often than not outperformed by even simple DBMs. The result is that EAs are generally considered inadequate for this problem. In this paper the potential of the Breeder Genetic Algorithm (BGA) for this numerical optimization problem is thoroughly explored. A case study is developed and used to tune the BGA for this kind of task, by searching the space of genetic operators and their parameters on the one hand, and varying selection pressure and population size on the other. It is found that specific configurations stand out over the rest, in a way that is consistent with previous findings. The importance of precisely characterizing the relationship between the last two quantities is also highlighted. To assess the validity of the algorithm, a further batch of experiments is devoted to comparing the BGA to a powerful DBM: a global method consisting of conjugate gradient embedded in a simulated annealing schedule. The results show comparable performance, pointing to this evolutionary algorithm as a feasible alternative to derivative-based methods.
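To make the approach concrete, the following is a minimal sketch of a BGA training a tiny feed-forward network, here on the XOR task. The operator choices (truncation selection, extended intermediate recombination, BGA mutation with discretized step exponents) follow the standard BGA literature, but the network size and all parameter values are illustrative assumptions, not the tuned configuration studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR training set (illustrative stand-in for the paper's case study)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])

N_HID = 3                            # hidden units (assumption)
N_W = 2 * N_HID + N_HID + N_HID + 1  # all weights and biases of a 2-3-1 net

def forward(w, X):
    """Evaluate the 2-3-1 tanh network encoded by the flat vector w."""
    W1 = w[:2 * N_HID].reshape(2, N_HID)
    b1 = w[2 * N_HID:3 * N_HID]
    W2 = w[3 * N_HID:4 * N_HID]
    b2 = w[-1]
    h = np.tanh(X @ W1 + b1)
    return np.tanh(h @ W2 + b2)

def mse(w):
    """Fitness: mean squared error of the network on the training set."""
    return np.mean((forward(w, X) - y) ** 2)

POP, TRUNC, GENS = 100, 25, 300  # population, parents kept, generations
RANGE, K = 5.0, 16               # mutation range and precision constant

pop = rng.uniform(-1, 1, size=(POP, N_W))
for gen in range(GENS):
    fit = np.array([mse(ind) for ind in pop])
    order = np.argsort(fit)
    parents = pop[order[:TRUNC]]       # truncation selection
    children = [pop[order[0]].copy()]  # elitism: keep the best individual
    while len(children) < POP:
        p1, p2 = parents[rng.integers(TRUNC, size=2)]
        # extended intermediate recombination, alpha in [-0.25, 1.25]
        a = rng.uniform(-0.25, 1.25, size=N_W)
        child = a * p1 + (1 - a) * p2
        # BGA mutation: each gene mutated with prob 1/n by a step
        # RANGE * 2^-k, with the exponent k drawn uniformly from 0..K-1
        mask = rng.random(N_W) < 1.0 / N_W
        sign = rng.choice([-1.0, 1.0], size=N_W)
        step = RANGE * 2.0 ** -rng.integers(0, K, size=N_W)
        child[mask] += (sign * step)[mask]
        children.append(child)
    pop = np.array(children)

best = min(pop, key=mse)
print("final MSE:", mse(best))
```

The elitist step makes the best-so-far error non-increasing, so the run should end with a substantially lower error than the random initial population; exact values depend on the seed and the assumed parameters.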
