Stabilization and speedup of convergence in training feedforward neural networks

Abstract: We review the training problem for feedforward neural networks and discuss techniques for accelerating and stabilizing convergence during training. These include a self-adjusting step gain, bipolar sigmoid activation functions, training on all classes in parallel, adjusting the exponential rates in the sigmoids, bounding the sigmoid derivatives away from zero, training on exemplars to which noise has been added, restricting the initial weight set to a subdomain where the sum-squared error is low, and adjusting the momentum coefficient over the iterations. We also examine methods for ensuring that the learning generalizes, including pruning unimportant weights and adding noise to the training exemplars.
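As a rough illustration of three of the techniques named in the abstract (a bipolar sigmoid with an adjustable exponential rate, a derivative bounded away from zero, and noise added to exemplars), a minimal Python sketch might look as follows. The abstract gives no formulas, so the function names, the parameters `rate`, `floor`, and `sigma`, and their default values are all assumptions made for illustration, not the paper's own definitions.

```python
import math
import random


def bipolar_sigmoid(x, rate=1.0):
    """Bipolar sigmoid with output in (-1, 1).

    (1 - exp(-rate*x)) / (1 + exp(-rate*x)) is algebraically tanh(rate*x/2).
    `rate` plays the role of the adjustable exponential rate (assumed name).
    """
    return math.tanh(0.5 * rate * x)


def bounded_derivative(y, rate=1.0, floor=0.05):
    """Derivative of the bipolar sigmoid, written in terms of its output y.

    The exact derivative is (rate/2) * (1 - y**2), which vanishes as |y| -> 1.
    Clamping it at `floor` (an assumed value) keeps the backpropagated
    error from stalling on saturated units.
    """
    return max(0.5 * rate * (1.0 - y * y), floor)


def add_noise(exemplar, sigma=0.05, rng=random):
    """Return a copy of the exemplar with zero-mean Gaussian noise added.

    `sigma` is an assumed noise level; `rng` may be the `random` module
    or a seeded `random.Random` instance for reproducibility.
    """
    return [v + rng.gauss(0.0, sigma) for v in exemplar]
```

In a training loop, one would presumably call `add_noise` on each exemplar before the forward pass and use `bounded_derivative` in place of the raw derivative when computing weight updates; suitable values for `floor` and `sigma` depend on the problem.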
