FFANN Weight Initialization: A New Method

In this paper, a new weight initialization scheme for feedforward artificial neural networks (FFANNs) is proposed. The proposed scheme distributes the biases/thresholds at equal intervals in an interval $(-\lambda, \lambda)$, while the weights are distributed uniformly in the same interval. The value of $\lambda$ is chosen such that the expected value of the net input to a node is 0 and its variance is 1. On a set of 10 tasks (5 function approximation tasks and 5 real-life benchmark regression tasks), the proposed weight initialization routine is compared to three existing weight initialization routines. The results indicate that, in terms of generalization performance (performance on the test data set), the proposed initialization is almost always better than the three existing routines and is never worse.
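The abstract does not give the closed form for $\lambda$, but it follows from the stated conditions under one natural set of assumptions: if the $n$ inputs to a node are standardized (zero mean, unit variance) and the weights are i.i.d. uniform on $(-\lambda, \lambda)$, each product $w_i x_i$ contributes variance $\lambda^2/3$, and a bias spread over the same interval contributes roughly $\lambda^2/3$ as well, giving $\mathrm{Var}(net) \approx (n+1)\lambda^2/3$ and hence $\lambda = \sqrt{3/(n+1)}$. The sketch below illustrates this reading for a single layer; the value of $\lambda$, the function name `init_layer`, and the treatment of the open interval are assumptions, not the authors' reference implementation.

```python
import numpy as np

def init_layer(n_in, n_out, rng=None):
    """Hedged sketch of the proposed initialization for one layer.

    Weights are i.i.d. Uniform(-lam, lam); biases are placed at equal
    intervals strictly inside (-lam, lam). The choice
    lam = sqrt(3 / (n_in + 1)) is an assumption: with standardized
    inputs it yields E[net] = 0 and Var[net] = (n_in + 1) * lam^2 / 3 = 1.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = np.sqrt(3.0 / (n_in + 1))
    # Weights: uniform on (-lam, lam), one row per output node.
    W = rng.uniform(-lam, lam, size=(n_out, n_in))
    # Biases: n_out equispaced points inside the open interval
    # (-lam, lam); the endpoints are dropped to respect openness.
    b = np.linspace(-lam, lam, n_out + 2)[1:-1]
    return W, b

# Example: initialize a layer with 8 inputs and 16 hidden nodes.
W, b = init_layer(8, 16)
```

Deterministic, equispaced biases give every hidden node a distinct offset from the start, which plausibly reduces the node symmetry that purely random initialization can leave intact; the weights remain random so that nodes still differ in orientation.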
