EVOLUTIONARY LEARNING OF RICH NEURAL NETWORKS IN THE BAYESIAN MODEL SELECTION FRAMEWORK

In this paper we focus on the problem of using a genetic algorithm for model selection within a Bayesian framework. We propose to reduce model selection to a search problem, using evolutionary computation to explore a posterior distribution over the model space. As a case study, we introduce ELeaRNT (Evolutionary Learning of Rich Neural Network Topologies), a genetic algorithm that evolves a particular class of models, namely Rich Neural Networks (RNN), in order to find an optimal domain-specific non-linear function approximator with good generalization capability. To evolve this kind of network, ELeaRNT uses a Bayesian fitness function. The experimental results show that ELeaRNT, driven by this Bayesian fitness function, finds networks that are well matched to the analysed problem and of acceptable complexity, in a completely automated way.
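In the Bayesian model selection framework underlying this approach, each candidate model M is scored by its posterior given the data D,

    P(M | D) ∝ P(D | M) P(M),

where the evidence P(D | M) = ∫ P(D | w, M) P(w | M) dw integrates out the network weights w. This is the standard formulation of the Bayesian evidence framework; the Bayesian fitness function assigns higher fitness to models with higher (approximate) evidence.

The following minimal sketch in Python illustrates the idea of driving a genetic algorithm with a Bayesian fitness. It is an illustration under simplifying assumptions, not the paper's method: the genome here is a single hidden-layer width (ELeaRNT's genome encodes far richer topologies), the BIC serves as a crude stand-in for the log evidence (the paper's fitness is computed differently), and the toy data and all names are ours.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy regression task (illustrative assumption, not the paper's benchmark).
    X = rng.uniform(-1.0, 1.0, size=(200, 1))
    y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(200)

    def train_mlp(h, steps=500, lr=0.05):
        """Train a one-hidden-layer tanh network of width h by full-batch
        gradient descent; return (mean squared error, parameter count)."""
        n = X.shape[0]
        W1 = 0.5 * rng.standard_normal((1, h)); b1 = np.zeros(h)
        W2 = 0.5 * rng.standard_normal(h);      b2 = 0.0
        for _ in range(steps):
            H = np.tanh(X @ W1 + b1)           # hidden activations, shape (n, h)
            err = H @ W2 + b2 - y              # residuals, shape (n,)
            dH = np.outer(err, W2) * (1.0 - H ** 2)
            W2 -= lr * (H.T @ err) / n;  b2 -= lr * err.mean()
            W1 -= lr * (X.T @ dH) / n;   b1 -= lr * dH.mean(axis=0)
        H = np.tanh(X @ W1 + b1)
        mse = np.mean((H @ W2 + b2 - y) ** 2)
        return mse, 3 * h + 1                  # params: W1 (h) + b1 (h) + W2 (h) + b2 (1)

    def fitness(h):
        # Crude Bayesian score: -BIC/2 as a proxy for the log evidence log P(D | M).
        # Retraining on every evaluation keeps the sketch short.
        mse, k = train_mlp(int(h))
        n = len(y)
        return -0.5 * (n * np.log(mse) + k * np.log(n))

    # Genetic loop: genomes are hidden-layer widths; selection by Bayesian fitness.
    pop = list(rng.integers(1, 16, size=8))
    for generation in range(10):
        parents = sorted(pop, key=fitness, reverse=True)[:4]                # truncation selection
        children = [max(1, p + int(rng.integers(-2, 3))) for p in parents]  # mutation
        pop = parents + children
    print("selected hidden width:", max(pop, key=fitness))

Because the complexity penalty k log(n) grows with the parameter count, the fitness automatically trades data fit against model complexity, which is the property the paper exploits to obtain networks of acceptable complexity without a separate regularization term.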
