Simultaneous Evolution of Structure and Activation Function Types in Generalized Multi-Layer Perceptrons

The most common (or even only) choice of activation function for multi-layer perceptrons (MLPs) widely used in research, engineering, and business is the logistic function. Among the reasons for this popularity are its boundedness in the unit interval, the fast computability of the function and its derivative, and a number of amenable mathematical properties in the realm of approximation theory. However, considering the huge variety of problem domains MLPs are applied in, it is intriguing to suspect that specific problems call for a single specific activation function or a set of them. Also, biological neural networks (BNNs), with their enormous variety of neurons mastering a set of complex tasks, may be considered to motivate this hypothesis. We present a number of experiments evolving the structure and activation function types (AFTs) of generalized multi-layer perceptrons (GMLPs), using the parallel netGEN system to train the evolved architectures. The number of network parameters subjected to evolution is increased in various steps, from learning parameters only for a GMLP of fixed architecture to simultaneous evolution of structure and activation function types. For experimental comparisons we utilize a synthetic and a real-world classification problem, and a chaotic time series prediction task.

Key-Words: Multi-Layer Perceptrons, Activation Functions, Evolutionary Algorithms, Classification, Prediction
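To make the central idea concrete, the following minimal Python sketch illustrates a GMLP whose layers carry named activation function types drawn from a small candidate set, together with a toy mutation operator that an evolutionary algorithm could use to vary the AFTs alongside the weights. This is our own illustration of the concept, not the authors' netGEN encoding; all names, the candidate AFT set, and the genome layout are assumptions made for clarity.

    # Illustrative sketch (assumption, not the authors' netGEN implementation):
    # a generalized MLP (GMLP) in which each layer carries its own activation
    # function type (AFT), so that structure and AFTs can be evolved together.
    import numpy as np

    # Candidate activation function types; the logistic function is the
    # conventional default discussed in the abstract.
    AFTS = {
        "logistic": lambda x: 1.0 / (1.0 + np.exp(-x)),
        "tanh":     np.tanh,
        "sine":     np.sin,
        "linear":   lambda x: x,
    }

    def gmlp_forward(x, weights, afts):
        """Forward pass through a GMLP whose i-th hidden layer uses the
        activation function named in afts[i]; a per-neuron assignment would
        work the same way."""
        h = x
        for W, name in zip(weights[:-1], afts):
            h = AFTS[name](h @ W)
        return h @ weights[-1]  # linear output layer

    def mutate_afts(afts, rate=0.1, rng=np.random.default_rng()):
        """Toy mutation operator: with probability `rate`, replace a layer's
        activation function type by a randomly chosen candidate."""
        names = list(AFTS)
        return [rng.choice(names) if rng.random() < rate else a for a in afts]

    # Tiny usage example: a 3-4-4-1 GMLP with evolvable activation types.
    rng = np.random.default_rng(0)
    weights = [rng.standard_normal(s) for s in [(3, 4), (4, 4), (4, 1)]]
    afts = ["logistic", "sine"]
    y = gmlp_forward(rng.standard_normal((5, 3)), weights, afts)
    print(y.shape, mutate_afts(afts, rate=0.5, rng=rng))

In an actual evolutionary run, a genome would additionally encode the connectivity (structure) of the GMLP, and the evolved networks would be trained, for example with RPROP, before fitness evaluation; the sketch only shows how per-layer AFTs can be represented and mutated.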
