A continuous-time approach to online optimization

We consider a family of learning strategies for online optimization problems that evolve in continuous time and we show that they lead to no regret. From a more traditional, discrete-time viewpoint, this continuous-time approach allows us to derive the no-regret properties of a large class of discrete-time algorithms including as special cases the exponential weight algorithm, online mirror descent, smooth fictitious play and vanishingly smooth fictitious play. In so doing, we obtain a unified view of many classical regret bounds, and we show that they can be decomposed into a term stemming from continuous-time considerations and a term which measures the disparity between discrete and continuous time. As a result, we obtain a general class of infinite horizon learning strategies that guarantee an $\mathcal{O}(n^{-1/2})$ regret bound without having to resort to a doubling trick.

[1]  Vladimir Vovk,et al.  Aggregating strategies , 1990, COLT '90.

[2]  Sylvain Sorin,et al.  Exponential weight algorithm in continuous time , 2008, Math. Program..

[3]  Manfred K. Warmuth,et al.  The Weighted Majority Algorithm , 1994, Inf. Comput..

[4]  L. Bregman The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming , 1967 .

[5]  David Haussler,et al.  How to use expert advice , 1993, STOC.

[6]  Vladimir Vovk,et al.  A game of prediction with expert advice , 1995, COLT '95.

[7]  Josef Hofbauer,et al.  Stochastic Approximations and Differential Inclusions, Part II: Applications , 2006, Math. Oper. Res..

[8]  Elad Hazan The convex optimization approach to regret minimization , 2011 .

[9]  William H. Sandholm,et al.  ON THE GLOBAL CONVERGENCE OF STOCHASTIC FICTITIOUS PLAY , 2002 .

[10]  John Darzentas,et al.  Problem Complexity and Method Efficiency in Optimization , 1983 .

[11]  丸山 徹 Convex Analysisの二,三の進展について , 1977 .

[12]  Michel Benaïm,et al.  Consistency of Vanishingly Smooth Fictitious Play , 2011, Math. Oper. Res..

[13]  Sébastien Bubeck,et al.  Introduction to Online Optimization , 2011 .

[14]  Shai Shalev-Shwartz,et al.  Online Learning and Online Convex Optimization , 2012, Found. Trends Mach. Learn..

[15]  A. Mas-Colell,et al.  Microeconomic Theory , 1995 .

[16]  Alexander Shapiro,et al.  Stochastic Approximation approach to Stochastic Programming , 2013 .

[17]  Martin Zinkevich,et al.  Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.

[18]  Nicolò Cesa-Bianchi,et al.  Analysis of two gradient-based algorithms for on-line regression , 1997, COLT '97.

[19]  Gábor Lugosi,et al.  Prediction, learning, and games , 2006 .

[20]  Josef Hofbauer,et al.  Stochastic Approximations and Differential Inclusions , 2005, SIAM J. Control. Optim..

[21]  Sébastien Bubeck,et al.  Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems , 2012, Found. Trends Mach. Learn..

[22]  P. Hall,et al.  Martingale Limit Theory and Its Application , 1980 .

[23]  James Hannan,et al.  4. APPROXIMATION TO RAYES RISK IN REPEATED PLAY , 1958 .

[24]  Claudio Gentile,et al.  Adaptive and Self-Confident On-Line Learning Algorithms , 2000, J. Comput. Syst. Sci..

[25]  Ambuj Tewari,et al.  Regularization Techniques for Learning with Matrices , 2009, J. Mach. Learn. Res..

[26]  D. Fudenberg,et al.  Consistency and Cautious Fictitious Play , 1995 .

[27]  Shai Shalev-Shwartz,et al.  Online learning: theory, algorithms and applications (למידה מקוונת.) , 2007 .

[28]  Manfred K. Warmuth,et al.  Exponentiated Gradient Versus Gradient Descent for Linear Predictors , 1997, Inf. Comput..

[29]  F. R. Rosendaal,et al.  Prediction , 2015, Journal of thrombosis and haemostasis : JTH.

[30]  Bastian Goldlücke,et al.  Variational Analysis , 2014, Computer Vision, A Reference Guide.

[31]  D. Fudenberg,et al.  The Theory of Learning in Games , 1998 .

[32]  Yurii Nesterov,et al.  Primal-dual subgradient methods for convex problems , 2005, Math. Program..

[33]  D. Fudenberg,et al.  Conditional Universal Consistency , 1999 .

[34]  Marc Teboulle,et al.  Mirror descent and nonlinear projected subgradient methods for convex optimization , 2003, Oper. Res. Lett..