Genetic Programming Discovers Efficient Learning Rules for the Hidden and Output Layers of Feedforward Neural Networks

The learning method is critical for obtaining good generalisation in neural networks with limited training data. The Standard BackPropagation (SBP) training algorithm suffers from several problems, such as sensitivity to initial conditions and very slow convergence. The aim of this work is to use Genetic Programming (GP) to discover new supervised learning algorithms that can overcome some of these problems. In previous research, a new learning algorithm for the output layer was discovered using GP, and comparisons with SBP on different problems demonstrated better performance. This paper shows that GP can also discover better learning algorithms for the hidden layers, to be used in conjunction with the previously discovered output-layer algorithm. Comparing these with SBP on different problems, we show that they provide better performance. This study indicates that there exist many supervised learning algorithms better than SBP and that GP can be used to discover them.
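For orientation, the SBP baseline referred to above can be sketched as follows. This is a minimal illustration of the textbook backpropagation error terms, assuming sigmoid units and a squared-error cost; it is not the GP-discovered rules, which are not stated in this abstract. All function names and the learning rate `eta` are hypothetical and chosen only for this example.

```python
import math


def sigmoid(x):
    """Logistic activation, assumed here for every unit."""
    return 1.0 / (1.0 + math.exp(-x))


def sbp_output_delta(target, output):
    """Textbook SBP error term for a sigmoid output unit (squared error)."""
    return (target - output) * output * (1.0 - output)


def sbp_hidden_delta(hidden_out, downstream_weights, downstream_deltas):
    """Textbook SBP error term for a sigmoid hidden unit.

    The error is propagated back through the weights that connect this
    hidden unit to the units in the next layer.
    """
    back = sum(w * d for w, d in zip(downstream_weights, downstream_deltas))
    return hidden_out * (1.0 - hidden_out) * back


def weight_update(eta, delta, presynaptic_activation):
    """Weight change: learning rate times error term times input activation."""
    return eta * delta * presynaptic_activation
```

In this framing, a GP search of the kind described would evolve alternative expressions in place of the bodies of `sbp_output_delta` and `sbp_hidden_delta`, leaving the rest of the training loop unchanged.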
