Explicit Gradient Learning for Black-Box Optimization

Black-Box Optimization (BBO) methods can find optimal policies for systems that interact with complex environments that have no analytical representation. As such, they are of interest in many Artificial Intelligence (AI) domains. Yet classical BBO methods fall short in high-dimensional non-convex problems and are thus often overlooked in real-world AI tasks. Here we present a BBO method, termed Explicit Gradient Learning (EGL), that is designed to optimize high-dimensional ill-behaved functions. We derive EGL by identifying the weak spots of methods that fit the objective function with a parametric Neural Network (NN) model and obtain the gradient signal by calculating the parametric gradient. Instead of fitting the function, EGL trains an NN to estimate the objective gradient directly. We prove the convergence of EGL to a stationary point and its robustness in the optimization of integrable functions. We evaluate EGL and achieve state-of-the-art results in two challenging problems: (1) the COCO test suite, against an assortment of standard BBO methods; and (2) a high-dimensional non-convex image generation task.
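
To make the core idea concrete, here is a minimal PyTorch sketch of the gradient-learning step, not the paper's implementation: a hypothetical network `GradNet` is trained to satisfy the first-order Taylor relation f(x + τ) − f(x) ≈ g_θ(x)ᵀτ for perturbations τ sampled around the current iterate, using only black-box evaluations of f, and the learned estimate g_θ(x) then drives a descent step. All names (`GradNet`, `egl_step`, the scale `eps`, the step size `lr_x`) are illustrative assumptions, and this simplification fits g_θ only at the current point rather than over a neighborhood.

```python
import torch
import torch.nn as nn

class GradNet(nn.Module):
    """Hypothetical network g_theta(x) approximating the objective gradient."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x):
        return self.net(x)

def egl_step(f, x, g, opt, eps=0.1, n_taus=64, fit_iters=20, lr_x=0.01):
    """One simplified EGL iteration: fit g near x, then take a descent step.

    The fitting loss is the squared first-order Taylor residual
        | f(x + tau) - f(x) - g(x)^T tau |^2
    with perturbations tau drawn at scale eps, so only black-box
    evaluations of f are needed (no analytic derivatives).
    """
    dim = x.numel()
    for _ in range(fit_iters):
        tau = eps * torch.randn(n_taus, dim)       # perturbations around x
        with torch.no_grad():
            df = f(x + tau) - f(x)                 # black-box function queries
        pred = (g(x.expand(n_taus, dim)) * tau).sum(dim=1)
        loss = ((pred - df) ** 2).mean()           # Taylor-residual loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        return x - lr_x * g(x)                     # descend along the estimate

# Toy usage: a quadratic stands in for the black-box objective.
f = lambda x: (x ** 2).sum(dim=-1)
x = torch.full((10,), 3.0)
g = GradNet(dim=10)
opt = torch.optim.Adam(g.parameters(), lr=1e-3)
for _ in range(200):
    x = egl_step(f, x, g, opt)
print(f(x))  # should approach the minimum at 0
```

The design point this sketch illustrates is the one the abstract draws: the network's output is the gradient estimate itself, so no differentiation through a fitted surrogate of the objective is required.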
