The Interchangeability of Learning Rate and Gain in Backpropagation Neural Networks

The backpropagation algorithm is widely used for training multilayer neural networks. In this publication, the gain of its activation function(s) is investigated. Specifically, it is proven that changing the gain of the activation function is equivalent to changing the learning rate and the weights. This simplifies the backpropagation learning rule by eliminating one of its parameters. The theorem can be extended to hold for some well-known variations on the backpropagation algorithm, such as the use of a momentum term, flat spot elimination, or adaptive gain. Furthermore, it is successfully applied to compensate for the nonstandard gain of optical sigmoids in optical neural networks.
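
As an illustration of the claimed equivalence, the following minimal numerical sketch (not taken from the paper itself) assumes the commonly cited form of the result: a network with gain beta, weights W, and learning rate eta behaves identically to a gain-1 network with weights beta*W and learning rate beta**2 * eta. It checks this on a toy two-layer sigmoid network trained with plain batch backpropagation; all variable names and the toy XOR task are illustrative choices, not from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(W1, W2, beta, eta, X, T, steps=200):
    """Plain batch backpropagation on a two-layer net with activation sigmoid(beta * a)."""
    for _ in range(steps):
        # forward pass
        H = sigmoid(beta * (X @ W1))    # hidden activations
        Y = sigmoid(beta * (H @ W2))    # network outputs
        # backward pass for the sum-of-squares error 0.5 * sum((Y - T)**2)
        G2 = (Y - T) * Y * (1.0 - Y)              # dE/d(beta * net input), output layer
        G1 = (G2 @ W2.T) * beta * H * (1.0 - H)   # dE/d(beta * net input), hidden layer
        W2 -= eta * beta * (H.T @ G2)
        W1 -= eta * beta * (X.T @ G1)
    return W1, W2

# toy training set (XOR)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])

beta, eta = 2.5, 0.3
W1 = rng.normal(size=(2, 4))
W2 = rng.normal(size=(4, 1))

# network A: gain beta, weights W, learning rate eta
WA1, WA2 = train(W1.copy(), W2.copy(), beta, eta, X, T)
# network B: gain 1, weights beta * W, learning rate beta**2 * eta
WB1, WB2 = train(beta * W1, beta * W2, 1.0, beta**2 * eta, X, T)

# B's weights should stay beta times A's weights throughout training
print(np.max(np.abs(WB1 - beta * WA1)), np.max(np.abs(WB2 - beta * WA2)))
```

Under this assumed scaling, the two printed differences stay near zero (up to floating-point rounding), since the weight trajectories of the two networks remain related by the factor beta at every training step.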
