SPI-Optimizer: An Integral-Separated PI Controller for Stochastic Optimization
Dan Wang | Yong Wang | Lu Fang | Haoqian Wang | Mengqi Ji