Recent Progress in Zeroth Order Optimization and Its Applications to Adversarial Robustness in Data Mining and Machine Learning

Zeroth-order (ZO) optimization is increasingly embraced for solving big data and machine learning problems when explicit expressions of the gradients are difficult or infeasible to obtain. It achieves gradient-free optimization by approximating the full gradient with efficient gradient estimators built from function evaluations alone. Important recent applications include: a) generation of prediction-evasive, black-box adversarial attacks on deep neural networks; b) online network management with limited computation capacity; c) parameter inference of black-box/complex systems; and d) bandit optimization, in which a player receives partial feedback in the form of loss function values revealed by her adversary. This tutorial aims to provide a comprehensive introduction to recent advances in ZO optimization methods, in both theory and applications. On the theory side, we will cover convergence rate and iteration complexity analyses of ZO algorithms and compare them to their first-order counterparts. On the application side, we will highlight one appealing application of ZO optimization to studying the robustness of deep neural networks: practical and efficient adversarial attacks that generate adversarial examples from a black-box machine learning model. We will also summarize potential research directions in ZO optimization, the associated big data challenges, and some open-ended data mining and machine learning problems.
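
To make the gradient-free mechanism concrete, the following is a minimal sketch (assuming NumPy; the helper names zo_gradient_estimate and zo_gradient_descent are illustrative, not from the tutorial) of the widely used two-point random-direction gradient estimator, which approximates the gradient of a black-box loss f at a point x by querying f at x + mu*u and x - mu*u along random unit directions u, followed by a plain ZO gradient-descent loop built on top of it.

```python
import numpy as np

def zo_gradient_estimate(f, x, mu=1e-3, num_directions=10):
    """Two-point random-direction estimate of the gradient of f at x.

    f is treated as a black box: only function evaluations are used.
    Averaging over several random unit directions reduces variance.
    """
    d = x.size
    grad = np.zeros(d)
    for _ in range(num_directions):
        u = np.random.randn(d)
        u /= np.linalg.norm(u)  # random direction on the unit sphere
        # Finite-difference slope along u, scaled by d so the
        # estimator is (approximately) unbiased for the smoothed gradient.
        grad += d * (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return grad / num_directions

def zo_gradient_descent(f, x0, step_size=0.01, iters=100):
    """Gradient descent that substitutes ZO estimates for true gradients."""
    x = x0.copy()
    for _ in range(iters):
        x -= step_size * zo_gradient_estimate(f, x)
    return x

# Example: minimize a quadratic queried only through its function values.
f = lambda x: np.sum((x - 1.0) ** 2)
x_star = zo_gradient_descent(f, np.zeros(5))
```

In the black-box adversarial attack setting highlighted above, f would be an attack loss computed purely from the target model's predictions, which is what makes query-based generation of adversarial examples possible without access to the model's internals or gradients.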
