On the influence of momentum acceleration on online learning
Kun Yuan | Bicheng Ying | Ali H. Sayed
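The paper examines how adding a momentum (heavy-ball) term to constant-step-size stochastic-gradient learners affects their online performance, in the tradition of Polyak's heavy-ball method [1] and the momentum LMS algorithm [8], [10]. For orientation, the sketch below shows the classical heavy-ball stochastic-gradient recursion on a streaming least-squares problem; it is a minimal illustration under assumed settings (the quadratic loss, the values of mu and beta, and all variable names are ours), not the authors' implementation.

    import numpy as np

    # Minimal sketch (illustrative): heavy-ball / momentum stochastic
    # gradient descent on a streaming least-squares problem, cf. [1], [8]:
    #
    #   w_k = w_{k-1} - mu * grad_k + beta * (w_{k-1} - w_{k-2})
    #
    # with step size mu and momentum parameter 0 <= beta < 1.

    rng = np.random.default_rng(0)
    d = 10
    w_true = rng.standard_normal(d)          # unknown model (synthetic)

    mu, beta = 0.01, 0.9                     # assumed step size and momentum
    w, w_prev = np.zeros(d), np.zeros(d)

    for k in range(5000):
        x = rng.standard_normal(d)           # streaming regressor
        y = x @ w_true + 0.1 * rng.standard_normal()  # noisy measurement
        grad = (x @ w - y) * x               # instantaneous LS gradient
        # heavy-ball update; w_prev keeps the previous iterate
        w, w_prev = w - mu * grad + beta * (w - w_prev), w

    print("squared error:", np.sum((w - w_true) ** 2))

Setting beta = 0 recovers the plain LMS/SGD recursion; with constant mu and beta, the update above is the momentum LMS form analyzed in [8], [10].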
[1] Boris Polyak. Some methods of speeding up the convergence of iteration methods, 1964.
[2] J. Proakis, et al. Channel identification for high speed digital communications, 1974.
[3] Kumpati S. Narendra, et al. Adaptation and learning in automatic systems, 1974.
[4] Y. Nesterov. A method for solving the convex programming problem with convergence rate O(1/k^2), 1983.
[5] S. Haykin, et al. Adaptive Filter Theory, 1986.
[6] S. Thomas Alexander, et al. Adaptive Signal Processing, 1986, Texts and Monographs in Computer Science.
[7] Maurice Bellanger, et al. Adaptive digital filters and signal analysis, 1987.
[8] J. Shynk, et al. The LMS algorithm with momentum updating, 1988, IEEE International Symposium on Circuits and Systems.
[9] M. Tugay, et al. Properties of the momentum LMS algorithm, 1989, Proceedings of the Electrotechnical Conference "Integrating Research, Industry and Education in Energy and Communication Engineering".
[10] John J. Shynk, et al. Analysis of the momentum LMS algorithm, 1990, IEEE Trans. Acoust. Speech Signal Process.
[11] W. Wiegerinck, et al. Stochastic dynamics of learning with momentum in neural networks, 1994.
[12] O. Nelles, et al. An Introduction to Optimization, 1996, IEEE Antennas and Propagation Magazine.
[13] William A. Sethares, et al. Analysis of momentum adaptive filtering algorithms, 1998, IEEE Trans. Signal Process.
[14] Ning Qian, et al. On the momentum term in gradient descent learning algorithms, 1999, Neural Networks.
[15] Nii O. Attoh-Okine, et al. Analysis of learning rate and momentum term in backpropagation neural network algorithm trained to predict pavement performance, 1999.
[16] Lok-Kee Ting, et al. Tracking performance of momentum LMS algorithm for a chirped sinusoidal signal, 2000, 10th European Signal Processing Conference.
[17] D. Bertsekas, et al. Convergence Rate of Incremental Subgradient Algorithms, 2000.
[18] M. Bellanger. Adaptive digital filters, 2001.
[19] Tong Zhang, et al. Solving large scale linear prediction problems using stochastic gradient descent algorithms, 2004, ICML.
[20] Yurii Nesterov, et al. Introductory Lectures on Convex Optimization - A Basic Course, 2014, Applied Optimization.
[21] Yurii Nesterov, et al. Smooth minimization of non-smooth functions, 2005, Math. Program.
[22] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.
[23] Léon Bottou, et al. The Tradeoffs of Large Scale Learning, 2007, NIPS.
[24] S. Haykin. Adaptive Filters, 2007.
[25] Alexandre d'Aspremont, et al. Smooth Optimization with Approximate Gradient, 2005, SIAM J. Optim.
[26] Lin Xiao, et al. Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization, 2009, J. Mach. Learn. Res.
[27] Marc Teboulle, et al. A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems, 2009, SIAM J. Imaging Sci.
[28] M. Baes. Estimate sequence methods: extensions and approximations, 2009.
[29] James T. Kwok, et al. Accelerated Gradient Methods for Stochastic Optimization and Online Learning, 2009, NIPS.
[30] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[31] Léon Bottou, et al. Large-Scale Machine Learning with Stochastic Gradient Descent, 2010, COMPSTAT.
[32] Peter J. Haas, et al. Large-scale matrix factorization with distributed stochastic gradient descent, 2011, KDD.
[33] Guanghui Lan, et al. An optimal method for stochastic composite optimization, 2011, Mathematical Programming.
[34] Saeed Ghadimi, et al. Optimal Stochastic Approximation Algorithms for Strongly Convex Stochastic Composite Optimization I: A Generic Algorithmic Framework, 2012, SIAM J. Optim.
[35] Mark W. Schmidt, et al. A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets, 2012, NIPS.
[36] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[37] Tong Zhang, et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction, 2013, NIPS.
[38] Geoffrey E. Hinton, et al. On the importance of initialization and momentum in deep learning, 2013, ICML.
[39] Razvan Pascanu, et al. Combining modality specific deep neural networks for emotion recognition in video, 2013, ICMI '13.
[40] Francis Bach, et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives, 2014, NIPS.
[41] Volkan Cevher, et al. Convex Optimization for Big Data: Scalable, randomized, and parallel algorithms for big data analytics, 2014, IEEE Signal Processing Magazine.
[42] Ali Sayed, et al. Adaptation, Learning, and Optimization over Networks, 2014, Found. Trends Mach. Learn.
[43] Jakub M. Tomczak, et al. Accelerated learning for Restricted Boltzmann Machine with momentum term, 2014, ICSEng.
[44] Leon Wenliang Zhong, et al. Accelerated Stochastic Gradient Method for Composite Regularization, 2014, AISTATS.
[45] Ali H. Sayed, et al. Adaptive Networks, 2014, Proceedings of the IEEE.
[46] Atsushi Nitanda, et al. Stochastic Proximal Gradient Descent with Acceleration Techniques, 2014, NIPS.
[47] Yurii Nesterov, et al. First-order methods of smooth convex optimization with inexact oracle, 2013, Mathematical Programming.
[48] Francis R. Bach, et al. From Averaging to Acceleration, There is Only a Step-size, 2015, COLT.
[49] Shai Shalev-Shwartz, et al. SDCA without Duality, 2015, ArXiv.
[50] Sergios Theodoridis, et al. Machine Learning: A Bayesian and Optimization Perspective, 2015.
[51] Dumitru Erhan, et al. Going deeper with convolutions, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[52] Ali H. Sayed, et al. Performance Limits of Online Stochastic Sub-Gradient Learning, 2015, ArXiv.
[53] Xiang Zhang, et al. Text Understanding from Scratch, 2015, ArXiv.
[54] Zeyuan Allen-Zhu. Katyusha: Accelerated Variance Reduction for Faster SGD, 2016, ArXiv.
[55] Benjamin Recht, et al. Analysis and Design of Optimization Algorithms via Integral Quadratic Constraints, 2014, SIAM J. Optim.
[56] Ali H. Sayed, et al. Performance limits of single-agent and multi-agent sub-gradient stochastic learning, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[57] Mark Tygert. Poor starting points in machine learning, 2016, ArXiv.
[58] Tong Zhang, et al. Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization, 2013, Mathematical Programming.
[59] Kun Yuan, et al. On the influence of momentum acceleration on online learning, 2016.
[60] Zeyuan Allen-Zhu, et al. Katyusha: the first direct acceleration of stochastic gradient methods, 2016, J. Mach. Learn. Res.
[61] Francis R. Bach, et al. Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression, 2016, J. Mach. Learn. Res.
[62] Ali H. Sayed, et al. Performance limits of stochastic sub-gradient learning, Part I: Single agent case, 2015, Signal Process.
[63] K. Schittkowski, et al. Nonlinear Programming, 2022.