Zeroth-order (Non)-Convex Stochastic Optimization via Conditional Gradient and Gradient Updates

In this paper, we propose and analyze zeroth-order stochastic approximation algorithms for both nonconvex and convex optimization. Specifically, we propose generalizations of the conditional gradient algorithm that achieve rates similar to those of the standard stochastic gradient algorithm while using only zeroth-order information. Furthermore, under a structural sparsity assumption, we first illustrate an implicit regularization phenomenon: the standard stochastic gradient algorithm with zeroth-order information adapts to the sparsity of the problem at hand simply through the choice of stepsize. We then propose a truncated stochastic gradient algorithm with zeroth-order information, whose rate of convergence depends only poly-logarithmically on the dimension.
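To make the two ingredients of the abstract concrete, below is a minimal NumPy sketch: a two-point Gaussian-smoothing gradient estimator (built from function evaluations only), a conditional gradient (Frank-Wolfe) loop over an ℓ1 ball, and a hard-truncation variant of zeroth-order gradient descent. This is an illustrative sketch, not the paper's exact algorithms or analysis; the function names (`zo_gradient`, `zo_frank_wolfe`, `zo_truncated_sgd`), the smoothing parameter `mu`, the batch size `m`, the ℓ1-ball feasible set, and the step-size schedule are all assumed choices for demonstration.

```python
import numpy as np

def zo_gradient(f, x, rng, mu=1e-4, m=20):
    """Two-point zeroth-order gradient estimate via Gaussian smoothing:
    average m directional finite differences (f(x + mu*u) - f(x)) / mu * u.
    Only function evaluations of f are used, never its true gradient."""
    fx = f(x)
    g = np.zeros_like(x)
    for _ in range(m):
        u = rng.standard_normal(x.size)
        g += (f(x + mu * u) - fx) / mu * u
    return g / m

def zo_frank_wolfe(f, x0, radius=1.0, iters=200, seed=0):
    """Zeroth-order conditional gradient (Frank-Wolfe) over the l1 ball
    {x : ||x||_1 <= radius}. The linear minimization oracle over this set
    returns a signed, scaled coordinate vector, so iterates stay feasible
    without any projection step."""
    rng = np.random.default_rng(seed)
    x = x0.astype(float).copy()
    for t in range(iters):
        g = zo_gradient(f, x, rng)
        i = np.argmax(np.abs(g))
        v = np.zeros_like(x)
        v[i] = -radius * np.sign(g[i])   # argmin_{||v||_1 <= radius} <g, v>
        gamma = 2.0 / (t + 2)            # classic Frank-Wolfe step size
        x = (1 - gamma) * x + gamma * v
    return x

def truncate(x, s):
    """Hard truncation: keep the s largest-magnitude entries, zero the rest."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-s:]
    out[idx] = x[idx]
    return out

def zo_truncated_sgd(f, x0, s=2, lr=0.1, iters=200, seed=0):
    """Zeroth-order gradient step followed by hard truncation: a simplified
    stand-in for the sparsity-exploiting update the abstract alludes to."""
    rng = np.random.default_rng(seed)
    x = x0.astype(float).copy()
    for _ in range(iters):
        x = truncate(x - lr * zo_gradient(f, x, rng), s)
    return x

if __name__ == "__main__":
    # Sparse quadratic: the minimizer has only two nonzero coordinates.
    target = np.array([0.5, -0.3, 0.0, 0.0, 0.0, 0.0])
    f = lambda x: float(np.sum((x - target) ** 2))
    print("Frank-Wolfe :", np.round(zo_frank_wolfe(f, np.zeros(6)), 3))
    print("Truncated GD:", np.round(zo_truncated_sgd(f, np.zeros(6)), 3))
```

Each iteration of either loop costs m + 1 function evaluations and no gradient queries, which is the defining feature of the zeroth-order setting; the Frank-Wolfe variant additionally avoids projections by moving toward a vertex of the feasible set.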
