The annealing robust backpropagation (ARBP) learning algorithm

Multilayer feedforward neural networks are often referred to as universal approximators. Nevertheless, when the training data are corrupted by gross noise such as outliers, traditional backpropagation learning schemes may fail to achieve acceptable performance. Although various robust learning algorithms have been proposed in the literature, those approaches still suffer from an initialization problem. Such robust learning algorithms employ a so-called M-estimator, whose loss function is meant to discriminate outliers from the majority of the data by attenuating their influence on learning. However, the loss function used in those algorithms may not discriminate against outliers correctly. In this paper, the annealing robust backpropagation (ARBP) learning algorithm, which incorporates the annealing concept into robust learning, is proposed to deal with modeling in the presence of outliers. The proposed algorithm has been applied to various examples, and the results demonstrate its superiority over other robust learning algorithms regardless of the outliers. Moreover, beyond adopting the annealing concept into robust learning, the annealing schedule k/t was found experimentally to achieve the best performance among the annealing schedules tested, where k is a constant and t is the epoch number.
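To make the annealing idea concrete, the following is a minimal sketch of ARBP-style training. The one-hidden-layer tanh network, the batch gradient-descent update, and the Cauchy-type M-estimator loss rho(e; beta) = (beta/2) ln(1 + e^2/beta) are illustrative assumptions, not details given in the abstract; only the annealing schedule beta(t) = k/t comes from the paper. Large beta makes the loss close to squared error, so early epochs behave like plain backpropagation; as beta shrinks with the epoch number, large residuals (the suspected outliers) are progressively down-weighted.

```python
import numpy as np


def psi(e, beta):
    """Influence function d(rho)/d(e) of the assumed Cauchy-type loss.

    It saturates for large residuals, so outliers contribute a bounded
    error signal; smaller beta means earlier saturation.
    """
    return e / (1.0 + e ** 2 / beta)


def arbp_train(x, y, n_hidden=10, epochs=2000, lr=0.05, k=100.0, seed=0):
    """Sketch of annealing robust backpropagation.

    x: inputs of shape (N, 1); y: targets of shape (N,).
    k sets the annealing schedule beta(t) = k / t from the abstract.
    """
    rng = np.random.default_rng(seed)
    w1 = rng.normal(scale=0.5, size=(n_hidden, 1))   # input -> hidden
    b1 = np.zeros((n_hidden, 1))
    w2 = rng.normal(scale=0.5, size=(1, n_hidden))   # hidden -> output
    b2 = 0.0
    n = len(y)
    for t in range(1, epochs + 1):
        beta = k / t                      # annealing schedule from the paper
        h = np.tanh(w1 @ x.T + b1)        # hidden activations, (n_hidden, N)
        y_hat = (w2 @ h + b2).ravel()     # network outputs, (N,)
        d = psi(y - y_hat, beta)          # robust signal replaces raw residual
        # Standard backpropagation, but driven by the clipped signal d.
        dh = (w2.T * d[None, :]) * (1.0 - h ** 2)    # hidden-layer deltas
        w2 += lr * (d[None, :] @ h.T) / n
        b2 += lr * d.mean()
        w1 += lr * (dh @ x) / n
        b1 += lr * dh.mean(axis=1, keepdims=True)
    return w1, b1, w2, b2


# Usage: fit a noisy sine curve in which every tenth target is a gross outlier.
x = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(x).ravel() + 0.1 * np.random.default_rng(1).normal(size=200)
y[::10] += 5.0                            # inject outliers
w1, b1, w2, b2 = arbp_train(x, y)
```

With the k/t schedule, the first epochs (large beta) fit the bulk of the data much like ordinary backpropagation, which sidesteps the initialization problem of fixed robust losses; the later epochs (small beta) then suppress the injected outliers.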
