Online Optimization Using Zeroth Order Oracles

This letter considers the iterative numerical optimization of time-varying cost functions when no gradient information is available. At each iteration, the proposed algorithm instead estimates a directional derivative by finite differences. The main contributions are the derivation of error bounds for such algorithms and the proposal of optimal algorithm parameters, e.g., step sizes, for strongly convex cost functions. The algorithm is applied to a source localization problem in which a sensing agent tracks a source that actively evades it. Numerical examples illustrate the theoretical results.
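To make the zeroth-order idea concrete, the following is a minimal sketch of finite-difference gradient estimation driving an online descent loop on a time-varying cost. The function names, the step size `alpha`, the smoothing parameter `h`, and the drifting-quadratic example are illustrative assumptions, not the paper's exact algorithm or its tuned parameter values.

```python
import numpy as np

def two_point_gradient_estimate(f, x, h=1e-4):
    """Estimate the gradient of f at x by central finite differences.

    Uses only function evaluations (a zeroth-order oracle); each
    coordinate costs two evaluations.
    """
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

def online_zeroth_order_descent(f_t, x0, T, alpha=0.1, h=1e-4):
    """Track the minimizer of a time-varying cost f_t(x, t) using only
    function evaluations at each iteration."""
    x = np.asarray(x0, dtype=float)
    trajectory = [x.copy()]
    for t in range(T):
        g = two_point_gradient_estimate(lambda z: f_t(z, t), x, h)
        x = x - alpha * g  # descent step with the estimated gradient
        trajectory.append(x.copy())
    return trajectory

# Usage: track the minimizer of a quadratic whose optimum drifts in
# time, loosely analogous to a source evading a sensing agent.
if __name__ == "__main__":
    def f_t(x, t):
        target = np.array([np.cos(0.05 * t), np.sin(0.05 * t)])
        return 0.5 * np.sum((x - target) ** 2)

    path = online_zeroth_order_descent(f_t, x0=[0.0, 0.0], T=100)
    print("final iterate:", path[-1])
```

Because the cost varies between iterations, the step size and smoothing parameter trade off tracking speed against the finite-difference estimation error, which is why their optimal choice is a central question for strongly convex costs.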
