A constructive algorithm to synthesize arbitrarily connected feedforward neural networks

In this work we present a constructive algorithm capable of producing arbitrarily connected feedforward neural network architectures for classification problems. Both the architecture and the synaptic weights of the neural network are defined by the learning procedure. The main purpose is to obtain a parsimonious neural network, in the form of a hybrid and dedicated linear/nonlinear classification model, which can lead to high levels of generalization performance. Although it is neither a global optimization algorithm nor a population-based metaheuristic, the constructive approach has mechanisms to avoid premature convergence: it mixes growing and pruning processes, and it implements a relaxation strategy for the learning error. The synaptic weights of the neural networks produced by the constructive mechanism are adjusted by a quasi-Newton method, and the decision to grow or prune the current network is based on a mutual information criterion. A set of benchmark experiments, including artificial and real datasets, indicates that the new proposal performs favorably when compared with alternative approaches in the literature, such as traditional MLP, mixture of heterogeneous experts, cascade correlation networks and an evolutionary programming system, in terms of both classification accuracy and parsimony of the obtained classifier.
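To make the idea of a mutual information criterion concrete, the following is a minimal sketch of a plug-in (histogram-based) estimate of the mutual information between a unit's discretized output and the class labels. It is an illustration only, not the criterion as defined in this work: the function names `mutual_information`, `discretize` and `is_informative`, the equal-width binning, and the threshold value are all assumptions introduced here.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def discretize(values, bins=4):
    """Map real-valued activations to equal-width bin indices (assumed binning)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid division by zero for constant outputs
    return [min(int((v - lo) / width), bins - 1) for v in values]


def is_informative(unit_outputs, labels, threshold=0.1, bins=4):
    """Hypothetical decision rule: keep a unit if its output shares enough
    information with the class labels. The threshold is an assumed value."""
    return mutual_information(discretize(unit_outputs, bins), labels) >= threshold


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    good_unit = [0.05, 0.10, 0.90, 0.95]   # output tracks the class
    poor_unit = [0.05, 0.95, 0.10, 0.90]   # output is unrelated to the class
    print(is_informative(good_unit, labels, bins=2))
    print(is_informative(poor_unit, labels, bins=2))
```

In a grow/prune scheme of this kind, a score like this could be evaluated for a candidate unit or connection before deciding whether to keep it; how the actual algorithm applies the criterion is not specified in the abstract.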
