Path relinking and GRG for artificial neural networks

Artificial neural networks (ANNs) have been widely used for both classification and prediction. This paper focuses on the prediction problem, in which an unknown function is approximated. ANNs can be viewed as models of real systems, built by tuning parameters known as weights. Training the net means finding the weights that optimize its performance, i.e., that minimize the error over the training set. Although the most popular method for training these networks is backpropagation, other optimization methods such as tabu search and scatter search have been applied successfully to this problem. In this paper we propose a path relinking implementation to solve the neural network training problem. Our method uses GRG, a gradient-based local NLP solver, as its improvement phase, whereas previous approaches used simpler local optimizers. Our experiments show that the proposed procedure competes with the best-known algorithms in terms of solution quality while requiring a reasonable computational effort.
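The training problem and the relinking idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how paths are built, so the sketch assumes linear interpolation in weight space between an initiating and a guiding solution. GRG is replaced here by a plain finite-difference gradient descent as a stand-in improvement phase. The network size, the function names (`predict`, `mse`, `local_improve`, `path_relink`), and the sine training data are all hypothetical choices made for illustration.

```python
import math
import random

HIDDEN = 2  # assumed number of hidden units (illustrative only)


def predict(weights, x):
    """One-input net: HIDDEN sigmoid units feeding a linear output."""
    out = weights[3 * HIDDEN]  # output bias
    for h in range(HIDDEN):
        w_in, bias, w_out = weights[3 * h: 3 * h + 3]
        z = max(-60.0, min(60.0, w_in * x + bias))  # clamp to avoid overflow
        out += w_out / (1.0 + math.exp(-z))
    return out


def mse(weights, data):
    """Mean squared error over the training set: the objective to minimize."""
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)


def local_improve(weights, data, steps=50, lr=0.1, eps=1e-5):
    """Stand-in for GRG: finite-difference descent that only accepts improvements."""
    best, best_err = list(weights), mse(weights, data)
    for _ in range(steps):
        grad = []
        for i in range(len(best)):
            bumped = list(best)
            bumped[i] += eps
            grad.append((mse(bumped, data) - best_err) / eps)
        trial = [w - lr * g for w, g in zip(best, grad)]
        trial_err = mse(trial, data)
        if trial_err < best_err:
            best, best_err = trial, trial_err
        else:
            lr *= 0.5  # step was too long; shrink it
    return best, best_err


def path_relink(initiating, guiding, data, n_steps=5):
    """Walk from `initiating` toward `guiding`, improving each intermediate point."""
    best, best_err = min(
        ((list(w), mse(w, data)) for w in (initiating, guiding)),
        key=lambda pair: pair[1],
    )
    for k in range(1, n_steps):
        t = k / n_steps
        point = [(1 - t) * a + t * b for a, b in zip(initiating, guiding)]
        cand, cand_err = local_improve(point, data)
        if cand_err < best_err:
            best, best_err = cand, cand_err
    return best, best_err


random.seed(0)
data = [(x / 10.0, math.sin(x / 10.0)) for x in range(-20, 21)]
n_weights = 3 * HIDDEN + 1
a = [random.uniform(-1, 1) for _ in range(n_weights)]
b = [random.uniform(-1, 1) for _ in range(n_weights)]
best, best_err = path_relink(a, b, data)
print("best training error:", best_err)
```

Because both endpoints are evaluated before the path is explored and only improving candidates are accepted, the returned error can never be worse than that of the better endpoint. A real implementation would maintain a reference set of elite solutions and relink between several pairs of them rather than a single pair.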
