Smoothed Online Convex Optimization in High Dimensions via Online Balanced Descent

We study Smoothed Online Convex Optimization, a version of online convex optimization where the learner incurs a penalty for changing her actions between rounds. Given an $\Omega(\sqrt{d})$ lower bound on the competitive ratio of any online algorithm, where $d$ is the dimension of the action space, we ask under what conditions this bound can be beaten. We introduce a novel algorithmic framework for this problem, Online Balanced Descent (OBD), which works by iteratively projecting the previous point onto a carefully chosen level set of the current cost function so as to balance the switching costs and hitting costs. We demonstrate the generality of the OBD framework by showing how, with different choices of "balance," OBD can improve upon state-of-the-art performance guarantees for both competitive ratio and regret. In particular, OBD is the first algorithm to achieve a dimension-free competitive ratio, $3 + O(1/\alpha)$, for locally polyhedral costs, where $\alpha$ measures the "steepness" of the costs. We also prove dimension-free bounds on the dynamic regret of OBD when the balance is performed in the dual space; these bounds imply that OBD has sublinear static regret.
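
The projection-and-balance step can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that do not come from the abstract: the hitting cost is the quadratic $f_t(x) = \frac{1}{2}\|x - v_t\|_2^2$ (so each level set is a Euclidean ball around the minimizer $v_t$), the switching cost is the Euclidean distance moved, "balance" means choosing the level $l$ at which the distance moved equals $\beta l$, and $l$ is found by bisection. The names `project_onto_level_set`, `obd_step`, and `beta` are hypothetical.

```python
import math


def project_onto_level_set(x_prev, v, level):
    """Euclidean projection of x_prev onto {x : 0.5 * ||x - v||^2 <= level}.

    Assumption: the hitting cost is the quadratic 0.5 * ||x - v||^2, so the
    level set is a ball of radius sqrt(2 * level) centered at v.
    """
    radius = math.sqrt(2.0 * level)
    diff = [a - b for a, b in zip(x_prev, v)]
    dist = math.sqrt(sum(c * c for c in diff))
    if dist <= radius:
        # x_prev already lies inside the level set, so it is its own projection.
        return list(x_prev)
    scale = radius / dist
    return [b + scale * c for b, c in zip(v, diff)]


def obd_step(x_prev, v, beta=1.0, iters=60):
    """One balanced step: find the level l where switching cost = beta * l.

    Assumption: the balance condition is ||x(l) - x_prev|| = beta * l, where
    x(l) is the projection of x_prev onto the l-level set. The gap between the
    two sides is decreasing in l, so bisection on [0, f(x_prev)] finds the root.
    """
    dist = math.dist(x_prev, v)
    lo, hi = 0.0, 0.5 * dist * dist  # hi is the hitting cost of staying put
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        x_mid = project_onto_level_set(x_prev, v, mid)
        if math.dist(x_mid, x_prev) > beta * mid:
            lo = mid  # moved too far relative to the hitting cost: raise level
        else:
            hi = mid  # moved too little: lower level
    return project_onto_level_set(x_prev, v, hi), hi


if __name__ == "__main__":
    x_new, level = obd_step([0.0, 0.0], [3.0, 4.0], beta=1.0)
    print("new point:", x_new)
    print("hitting cost (level):", level)
    print("switching cost:", math.dist(x_new, [0.0, 0.0]))
```

In this sketch, a larger `beta` lets the learner move further per unit of hitting cost, so the step lands closer to the minimizer; a smaller `beta` keeps the learner closer to the previous point. For a general convex cost the projection onto a level set has no closed form and would itself require a convex solver.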
