ZEROTH-ORDER STOCHASTIC PROJECTED GRADIENT DESCENT FOR NONCONVEX OPTIMIZATION

In this paper, we analyze the convergence of the zeroth-order stochastic projected gradient descent (ZO-SPGD) method for constrained convex and nonconvex optimization, where only objective function values (not gradients) are directly available. We establish statistical properties of a new random gradient estimator constructed from random direction samples drawn from a bounded uniform distribution. We prove that ZO-SPGD achieves an $O\!\left(\frac{d}{bq\sqrt{T}} + \frac{1}{\sqrt{T}}\right)$ convergence rate for convex but non-smooth optimization, where $d$ is the number of optimization variables, $b$ is the minibatch size, $q$ is the number of random direction samples used for gradient estimation, and $T$ is the number of iterations. For nonconvex optimization, we show that ZO-SPGD achieves an $O\!\left(\frac{1}{\sqrt{T}}\right)$ convergence rate but suffers an additional $O\!\left(\frac{d+q}{bq}\right)$ error. Our theoretical investigation of ZO-SPGD provides a general framework for studying the convergence rates of zeroth-order algorithms.
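To make the setup concrete, the sketch below shows a ZO-SPGD-style iteration in Python: a gradient estimate built only from function-value queries along $q$ random directions, followed by a projected descent step. This is an illustrative sketch, not the paper's exact construction; in particular, the sphere-sampled directions (the paper uses a bounded uniform distribution), the $\ell_2$-ball constraint set, and the step-size and smoothing choices are assumptions made here for demonstration.

```python
import numpy as np

def zo_gradient_estimate(f, x, q=10, mu=1e-3, rng=None):
    """Zeroth-order gradient estimate of f at x from q random direction
    samples, using only function-value queries. Directions are drawn
    uniformly from the unit Euclidean sphere (an assumption for this
    sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    g = np.zeros(d)
    fx = f(x)
    for _ in range(q):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)              # random unit direction
        g += (f(x + mu * u) - fx) / mu * u  # forward finite difference
    return (d / q) * g

def project_l2_ball(x, radius=1.0):
    """Euclidean projection onto an l2 ball (placeholder constraint set)."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def zo_spgd(f, x0, T=1000, eta=0.01, q=10, mu=1e-3):
    """Projected descent loop driven by the zeroth-order estimate."""
    x = x0.copy()
    for _ in range(T):
        g = zo_gradient_estimate(f, x, q=q, mu=mu)
        x = project_l2_ball(x - eta * g)
    return x

# Example: minimize a quadratic using only function evaluations.
if __name__ == "__main__":
    target = np.array([0.5, -0.3, 0.2])
    f = lambda x: np.sum((x - target) ** 2)
    x_hat = zo_spgd(f, x0=np.zeros(3), T=2000)
    print(x_hat)  # approaches target (which lies inside the unit ball)
```

In this sketch, increasing $q$ reduces the variance of the gradient estimate at the cost of more function queries per iteration, mirroring the role of $q$ in the stated convergence rates.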
