Convergence of a modified gradient-based learning algorithm with penalty for single-hidden-layer feed-forward networks

This paper studies a new algorithm for training feed-forward neural networks, called USA with penalty, which builds on the upper-layer-solution-aware (USA) algorithm by introducing a penalty term into the empirical risk. Both theoretical analysis and numerical results show that the penalty controls the magnitude of the network weights. A deterministic convergence analysis of the new algorithm is also given. The empirical risk with the penalty term is shown to decrease monotonically during training. The weak and strong convergence results state, respectively, that the gradient of the total error function with respect to the weights tends to zero and that the weight sequence converges to a fixed point as the number of iterations tends to infinity. A numerical experiment supports the theoretical results.
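To make the idea concrete, the following is a minimal NumPy sketch of gradient training with a penalty for a single-hidden-layer network, in which the output-layer weights are solved in closed form for the current hidden-layer weights. It is an illustration under stated assumptions, not the paper's exact formulation: the squared L2 penalty on both weight matrices, the sigmoid activation, the fixed learning rate, the toy data, and every function and parameter name (`solve_upper_layer`, `penalized_risk`, `lam`, `lr`, and so on) are hypothetical choices made for this example.

```python
import numpy as np


def hidden_output(W, Xb):
    """Sigmoid hidden-layer output H = sigma(W @ Xb), shape (q, N)."""
    return 1.0 / (1.0 + np.exp(-(W @ Xb)))


def solve_upper_layer(H, T, lam):
    """Closed-form ridge solution for the output weights U given H.

    Minimizes (1/2N)||U H - T||^2 + (lam/2)||U||^2 over U (assumed penalty).
    """
    q, n = H.shape
    A = H @ H.T + n * lam * np.eye(q)
    return np.linalg.solve(A, H @ T.T).T


def penalized_risk(W, Xb, T, lam):
    """Empirical risk plus an assumed squared L2 penalty on U and W."""
    H = hidden_output(W, Xb)
    U = solve_upper_layer(H, T, lam)
    R = U @ H - T
    n = Xb.shape[1]
    return 0.5 * np.sum(R**2) / n + 0.5 * lam * (np.sum(U**2) + np.sum(W**2))


def penalized_gradient(W, Xb, T, lam):
    """Gradient of penalized_risk with respect to the hidden weights W.

    Because U minimizes the penalized risk for the current W, the term that
    would come from differentiating U with respect to W vanishes, so only the
    partial derivative with U held at its solved value remains.
    """
    H = hidden_output(W, Xb)
    U = solve_upper_layer(H, T, lam)
    R = U @ H - T
    n = Xb.shape[1]
    dZ = (U.T @ R) * H * (1.0 - H) / n  # backpropagate through the sigmoid
    return dZ @ Xb.T + lam * W


def train(Xb, T, q=8, lam=1e-2, lr=0.05, steps=300, seed=0):
    """Plain gradient descent on W; returns final W and the risk history."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.5, size=(q, Xb.shape[0]))
    risks = [penalized_risk(W, Xb, T, lam)]
    for _ in range(steps):
        W = W - lr * penalized_gradient(W, Xb, T, lam)
        risks.append(penalized_risk(W, Xb, T, lam))
    return W, risks


def make_toy_data(n=100, seed=1):
    """Hypothetical 1-D regression data with a bias row appended to the input."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=(1, n))
    Xb = np.vstack([x, np.ones((1, n))])
    T = np.sin(3.0 * x)
    return Xb, T
```

Calling `train(*make_toy_data())` returns the trained hidden-layer weights and the sequence of penalized risk values. Inspecting that sequence, together with the norm of `penalized_gradient` at the final weights, is one informal way to observe the behaviour the abstract describes (decreasing risk, vanishing gradient) on a toy problem.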
