Research on three-step accelerated gradient algorithm in deep learning
[1] W. Pitts, et al. A Logical Calculus of the Ideas Immanent in Nervous Activity (1943), 2021, Ideas That Created the Future.
[2] James L. McClelland, et al. Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations, 1986.
[3] Guanghui Lan, et al. Optimal Adaptive and Accelerated Stochastic Gradient Descent, 2018, ArXiv.
[4] Kurt Hornik, et al. Support Vector Machines in R, 2006.
[5] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[6] Stefan Fritsch, et al. Training of Neural Networks [R package neuralnet version 1.44.2], 2019.
[7] B. V. Shah, et al. Some Algorithms for Minimizing a Function of Several Variables, 1964.
[8] Kenji Kawaguchi, et al. Deep Learning without Poor Local Minima, 2016, NIPS.
[9] M. N. Vrahatis, et al. Parallel tangent methods with variable stepsize, 2004, 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541).
[10] Yee Whye Teh, et al. A Fast Learning Algorithm for Deep Belief Nets, 2006, Neural Computation.
[11] Kurt Hornik, et al. Approximation capabilities of multilayer feedforward networks, 1991, Neural Networks.
[12] Kurt Hornik, et al. Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien [R package e1071 version 1.7-4], 2020.
[13] Kevin K. Chen, et al. The Upper Bound on Knots in Neural Networks, 2016, ArXiv.
[14] Boris Polyak. Some methods of speeding up the convergence of iteration methods, 1964.
[15] H. Borchers. Practical Numerical Math Functions [R package pracma version 2.2.9], 2019.
[16] Ronald Davis, et al. Neural networks and deep learning, 2017.
[17] Geoffrey E. Hinton, et al. Learning representations by back-propagating errors, 1986, Nature.
[18] Guanghui Lan. Convex optimization under inexact first-order information, 2009.
[19] Ronald L. Rivest, et al. Training a 3-node neural network is NP-complete, 1988, COLT '88.
[20] L. Armijo. Minimization of functions having Lipschitz continuous first partial derivatives, 1966.
[21] Geoffrey E. Hinton, et al. Reducing the Dimensionality of Data with Neural Networks, 2006, Science.
[22] Emile Fiesler, et al. Neural Networks with Adaptive Learning Rate and Momentum Terms, 1995.
[23] Guanghui Lan, et al. An optimal method for stochastic composite optimization, 2011, Mathematical Programming.
[24] Claus Nebauer, et al. Evaluation of convolutional neural networks for visual recognition, 1998, IEEE Trans. Neural Networks.
[25] George D. Magoulas, et al. Effective Backpropagation Training with Variable Stepsize, 1997, Neural Networks.