Evolutionary algorithms for hyperparameter tuning on neural network models

In this work we present a comparison of several Artificial Neural Network weight-initialization methods based on Evolutionary Algorithms. We tested these methods on three datasets: KEEL regression problems, a random synthetic dataset, and a dataset of concentrations of different chemical species from the Bioethanol To Olefins process. The results demonstrate that tuning the initial weights of a neural network significantly improves its performance compared to random initialization. In addition, several crossover algorithms were tested to identify the best one for this objective. The post-hoc analysis found significant differences between the implemented crossover algorithms when the network has four or more inputs.
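The general idea can be illustrated with a minimal sketch (this is an illustrative reconstruction, not the paper's exact method): each individual in the population is a flat vector of candidate initial weights for a small feedforward network, fitness is the mean squared error on a toy regression task, and arithmetic crossover with Gaussian mutation produces offspring. The network sizes, selection scheme, and mutation rate below are all assumed for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = sin(x) (assumed example task, not from the paper)
X = np.linspace(-3, 3, 50).reshape(-1, 1)
y = np.sin(X)

N_IN, N_HID, N_OUT = 1, 5, 1
N_W = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT  # weights + biases

def forward(w, X):
    """Evaluate a 1-hidden-layer tanh network from a flat weight vector."""
    i = 0
    W1 = w[i:i + N_IN * N_HID].reshape(N_IN, N_HID); i += N_IN * N_HID
    b1 = w[i:i + N_HID]; i += N_HID
    W2 = w[i:i + N_HID * N_OUT].reshape(N_HID, N_OUT); i += N_HID * N_OUT
    b2 = w[i:i + N_OUT]
    return np.tanh(X @ W1 + b1) @ W2 + b2

def fitness(w):
    return np.mean((forward(w, X) - y) ** 2)  # lower is better

def arithmetic_crossover(p1, p2):
    """Per-gene convex combination of two parents (one of several crossover choices)."""
    alpha = rng.uniform(size=p1.shape)
    return alpha * p1 + (1 - alpha) * p2

# Evolve a population of candidate initial weight vectors
pop = rng.normal(0.0, 0.5, size=(30, N_W))
for gen in range(50):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[:10]]          # truncation selection + elitism
    children = []
    for _ in range(len(pop) - len(parents)):
        p1, p2 = parents[rng.choice(len(parents), 2, replace=False)]
        child = arithmetic_crossover(p1, p2)
        child += rng.normal(0.0, 0.05, size=child.shape)  # Gaussian mutation
        children.append(child)
    pop = np.vstack([parents, children])

best = min(pop, key=fitness)
# `best` would then serve as the initial weights handed to gradient-based training
```

In this setup the evolutionary search only chooses the starting point; a conventional trainer (e.g. Levenberg–Marquardt or backpropagation) would still perform the fine optimization from `best`.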
