Training multilayer networks with discrete activation functions

Efficient training of multilayer networks with discrete activation functions is a subject of considerable ongoing research. Such networks greatly reduce the complexity of hardware implementations, provide tolerance to noise, and make the internal representations easier to interpret. Methods available in the literature focus mainly on two-state (binary) nodes and train these networks by approximating the gradient and modifying gradient descent accordingly. However, they exhibit slow convergence and a low success rate compared with networks with continuous activations. In this work, we propose an evolution-motivated approach that is eminently suitable for networks with discrete output states, and we compare its performance with that of four other methods.
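
To illustrate why an evolutionary scheme fits this setting, the following minimal Python sketch trains a one-hidden-layer network with hard-limiting (threshold) activations on XOR by evolving a population of weight vectors. The differential-evolution update (DE/rand/1), the population size, and the control parameters F and CR are illustrative assumptions for this sketch, not necessarily the exact configuration studied in the paper.

# Minimal sketch: evolution-based training of a threshold-unit network.
# Backpropagation is inapplicable here because the step activation has zero
# gradient almost everywhere; a population-based search needs only error values.
# The DE/rand/1 scheme and all control parameters below are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)

def step(x):
    return (x >= 0).astype(float)  # hard-limiting (threshold) activation

def forward(w, X, n_in, n_hid):
    # Unpack a flat parameter vector into two weight matrices (biases folded in).
    W1 = w[: (n_in + 1) * n_hid].reshape(n_in + 1, n_hid)
    W2 = w[(n_in + 1) * n_hid :].reshape(n_hid + 1, 1)
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # append bias input
    H = step(Xb @ W1)
    Hb = np.hstack([H, np.ones((H.shape[0], 1))])   # append bias input
    return step(Hb @ W2)

def error(w, X, y, n_in, n_hid):
    # Number of misclassified patterns: a piecewise-constant objective.
    return np.sum(forward(w, X, n_in, n_hid).ravel() != y)

# XOR: the classic task a single threshold unit cannot represent.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)
n_in, n_hid = 2, 2
dim = (n_in + 1) * n_hid + (n_hid + 1) * 1

NP, F, CR = 20, 0.5, 0.9                      # population size, scale, crossover rate
pop = rng.uniform(-1, 1, (NP, dim))
fit = np.array([error(w, X, y, n_in, n_hid) for w in pop])

for gen in range(200):
    for i in range(NP):
        # Pick three distinct population members other than i.
        a, b, c = pop[rng.choice([j for j in range(NP) if j != i], 3, replace=False)]
        mutant = a + F * (b - c)              # DE/rand/1 mutation
        cross = rng.random(dim) < CR
        cross[rng.integers(dim)] = True       # ensure at least one mutant gene
        trial = np.where(cross, mutant, pop[i])
        f = error(trial, X, y, n_in, n_hid)
        if f <= fit[i]:                       # greedy one-to-one selection
            pop[i], fit[i] = trial, f
    if fit.min() == 0:
        break

best = pop[fit.argmin()]
print("generation:", gen, "errors:", int(fit.min()))
print("outputs:", forward(best, X, n_in, n_hid).ravel())

Because the misclassification count is piecewise constant in the weights, a population-based search of this kind sidesteps the zero-gradient problem entirely rather than approximating derivatives, which is the property the abstract appeals to.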

[1]  Zheng Zeng,et al.  A learning algorithm for multi-layer perceptrons with hard-limiting threshold units , 1994, Proceedings of IEEE Workshop on Neural Networks for Signal Processing.

[2]  Antonette M. Logar,et al.  An iterative method for training multilayer networks with threshold functions , 1994, IEEE Trans. Neural Networks.

[3]  D. J. Toms,et al.  Training binary node feedforward neural networks by back propagation of error , 1990 .

[4]  W. Pitts,et al.  A Logical Calculus of the Ideas Immanent in Nervous Activity (1943) , 2021, Ideas That Created the Future.

[5]  Vassilis P. Plagianakos,et al.  Neural network training with constrained integer weights , 1999, Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406).

[6]  Bernard Widrow,et al.  Neural nets for adaptive filtering and adaptive pattern recognition , 1988, Computer.

[7]  George D. Magoulas,et al.  A training method for discrete multilayer neural networks , 1997 .

[8]  Peter L. Bartlett,et al.  Using random weights to train multilayer networks of hard-limiting units , 1992, IEEE Trans. Neural Networks.

[9]  G. Kane Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol 1: Foundations, vol 2: Psychological and Biological Models , 1994 .

[10]  Dennis J. Volper,et al.  Representing and learning Boolean functions of multivalued features , 1990, IEEE Trans. Syst. Man Cybern..

[11]  Padhraic Smyth,et al.  Discrete recurrent neural networks for grammatical inference , 1994, IEEE Trans. Neural Networks.

[12]  G. J. Gibson,et al.  On the decision regions of multilayer perceptrons , 1990, Proc. IEEE.

[13]  Rainer Storn,et al.  Differential Evolution – A Simple and Efficient Heuristic for global Optimization over Continuous Spaces , 1997, J. Glob. Optim..

[14]  Masayoshi Tomizuka,et al.  Modeling and conventional/adaptive PI control of a Lathe cutting process , 1988 .

[15]  Padhraic Smyth,et al.  Learning Finite State Machines With Self-Clustering Recurrent Networks , 1993, Neural Computation.

[16]  Vassilis P. Plagianakos,et al.  Training neural networks with threshold activation functions and constrained integer weights , 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium.