Evolutionary training of hardware realizable multilayer perceptrons

Multilayer perceptrons (MLPs) with threshold activation functions (binary step functions) greatly reduce the complexity of hardware implementations of neural networks, provide tolerance to noise, and make the internal representations easier to interpret. In certain cases, such as learning stationary tasks, it may be sufficient to find appropriate weights for an MLP with threshold activation functions by software simulation and then transfer the weight values to the hardware implementation. Efficient training of these networks is a subject of considerable ongoing research. Methods available in the literature mainly focus on two-state (threshold) nodes and train the networks either by approximating the gradient of the error function and modifying gradient descent accordingly, or by progressively altering the shape of the activation functions. In this paper, we propose an evolution-motivated approach that is well suited to networks with threshold functions, and we compare its performance with four other methods. The proposed evolutionary strategy does not need gradient-related information, is applicable when threshold activations are used from the beginning of training, as in “on-chip” training, and is able to train networks with integer weights.
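The idea of gradient-free training with threshold units and integer weights can be illustrated with a minimal sketch. The abstract does not specify the evolutionary operators, so this example assumes a differential-evolution-style mutation and crossover with rounding to integers; the 2-2-1 network, the XOR task, the function names, and all parameter values (`pop_size`, `f`, `cr`, `w_max`) are illustrative assumptions, not the authors' setup.

```python
import random

# XOR patterns: (inputs, target). Chosen only as a small illustrative task.
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
N_WEIGHTS = 9  # 2-2-1 network: 2*(2 weights + bias) hidden, 2 weights + bias output


def step(x):
    """Binary threshold activation."""
    return 1 if x >= 0 else 0


def forward(w, inputs):
    """Evaluate a 2-2-1 threshold MLP; w is a flat list of 9 integers."""
    x1, x2 = inputs
    h1 = step(w[0] * x1 + w[1] * x2 + w[2])
    h2 = step(w[3] * x1 + w[4] * x2 + w[5])
    return step(w[6] * h1 + w[7] * h2 + w[8])


def error(w, data=XOR):
    """Number of misclassified patterns; no gradient information is used."""
    return sum(forward(w, x) != t for x, t in data)


def train(pop_size=20, generations=200, f=0.5, cr=0.7, w_max=3, seed=0):
    """Evolve integer weight vectors (assumed DE-style operators)."""
    rng = random.Random(seed)
    pop = [[rng.randint(-w_max, w_max) for _ in range(N_WEIGHTS)]
           for _ in range(pop_size)]
    errs = [error(ind) for ind in pop]
    history = [min(errs)]  # best error after each generation
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(N_WEIGHTS)  # force at least one mutated gene
            trial = list(pop[i])
            for k in range(N_WEIGHTS):
                if rng.random() < cr or k == j_rand:
                    v = pop[a][k] + f * (pop[b][k] - pop[c][k])
                    # Round and clip so the weights stay bounded integers.
                    trial[k] = max(-w_max, min(w_max, round(v)))
            e = error(trial)
            if e <= errs[i]:  # greedy selection: keep the trial if no worse
                pop[i], errs[i] = trial, e
        history.append(min(errs))
        if history[-1] == 0:
            break
    best = min(range(pop_size), key=errs.__getitem__)
    return pop[best], errs[best], history
```

Calling `train()` returns the best integer weight vector found, its misclassification count, and the per-generation best error. Because selection is greedy, the best error never increases from one generation to the next, but whether a given seed reaches zero error is not guaranteed. Threshold activations are used from the first evaluation onward, which mirrors the "on-chip" setting described in the abstract.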
