An adaptive nonmonotone active set - weight constrained - neural network training algorithm

Abstract In this work, a new direction for improving the classification accuracy of artificial neural networks is proposed: bounding the weights of the network during the training process. Furthermore, a new adaptive nonmonotone active set – weight constrained – neural network training algorithm is proposed to demonstrate the efficacy and efficiency of this approach. The proposed training algorithm consists of two phases: a gradient projection phase, which utilizes an adaptive nonmonotone line search, and an unconstrained optimization phase, which exploits the box structure of the bounds. In addition, a set of switching criteria is defined for efficiently alternating between the two phases. Our preliminary numerical experiments illustrate that the proposed algorithm outperforms classical neural network training algorithms in classification accuracy, providing empirical evidence of more stable, efficient and reliable learning.
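
The abstract only outlines the two-phase structure, so the following Python sketch illustrates the general shape of such a scheme; it is not the authors' algorithm. The adaptive nonmonotone rule, the exact switching criteria, and the unconstrained solver used in the paper are replaced here by simple placeholders: a fixed-memory reference value in the spirit of Grippo-type nonmonotone line searches, a test on whether the set of weights sitting at their bounds has changed, and plain gradient steps on the free weights. All function and parameter names are illustrative assumptions.

import numpy as np

def project(w, lo, hi):
    """Project the weight vector onto the box [lo, hi]."""
    return np.clip(w, lo, hi)

def nonmonotone_reference(f_history, memory=10):
    """Reference value for the nonmonotone sufficient-decrease test:
    the maximum of the most recent objective values (fixed-memory placeholder)."""
    return max(f_history[-memory:])

def train_box_constrained(f, grad, w0, lo, hi,
                          max_iters=1000, alpha0=1.0, c1=1e-4, tol=1e-6):
    w = project(np.asarray(w0, dtype=float), lo, hi)
    f_history = [f(w)]
    for _ in range(max_iters):
        g = grad(w)

        # Phase 1: gradient projection step, accepted by a nonmonotone
        # (rather than strictly decreasing) Armijo-type test.
        f_ref = nonmonotone_reference(f_history)
        alpha = alpha0
        while True:
            w_trial = project(w - alpha * g, lo, hi)
            if f(w_trial) <= f_ref + c1 * np.dot(g, w_trial - w) or alpha < 1e-12:
                break
            alpha *= 0.5

        # Placeholder switching test: if the projection step did not change
        # the set of weights at their bounds, run a few Phase 2 steps.
        at_bounds_old = (w <= lo) | (w >= hi)
        w = w_trial
        at_bounds_new = (w <= lo) | (w >= hi)

        # Phase 2: "unconstrained" steps restricted to the free weights;
        # a CG or quasi-Newton solver would be used here in practice.
        if np.array_equal(at_bounds_old, at_bounds_new):
            for _ in range(5):
                g = grad(w)
                step = np.zeros_like(w)
                step[~at_bounds_new] = -0.1 * g[~at_bounds_new]
                w = project(w + step, lo, hi)

        f_history.append(f(w))
        # Stop when the projected gradient is approximately zero.
        if np.linalg.norm(project(w - grad(w), lo, hi) - w) < tol:
            break
    return w

As a quick sanity check, the sketch can be exercised on a toy quadratic such as f(w) = ||w - c||^2 with a box that excludes c, in which case the returned minimizer should land on the boundary of the box.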
