Paper information - A continuous-time approach to online optimization

A continuous-time approach to online optimization

We consider a family of learning strategies for online optimization problems that evolve in continuous time and we show that they lead to no regret. From a more traditional, discrete-time viewpoint, this continuous-time approach allows us to derive the no-regret properties of a large class of discrete-time algorithms including as special cases the exponential weight algorithm, online mirror descent, smooth fictitious play and vanishingly smooth fictitious play. In so doing, we obtain a unified view of many classical regret bounds, and we show that they can be decomposed into a term stemming from continuous-time considerations and a term which measures the disparity between discrete and continuous time. As a result, we obtain a general class of infinite horizon learning strategies that guarantee an $\mathcal{O}(n^{-1/2})$ regret bound without having to resort to a doubling trick.
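As a rough illustration of the kind of discrete-time strategy the abstract refers to, the sketch below runs the exponential weight algorithm with a step size that shrinks like $1/\sqrt{t}$, so the horizon $n$ need not be known in advance and no doubling trick is used. The function name `exp_weights_average_regret`, the specific step size $\eta_t = \sqrt{8 \ln d / t}$, and the assumption that losses lie in $[0, 1]$ are illustrative choices made here, not taken from the paper.

```python
import math
import random


def exp_weights_average_regret(losses):
    """Run exponential weights with a decreasing step size.

    `losses` is a list of n rounds; each round is a list of d losses in [0, 1]
    (assumption). Returns the average regret against the best fixed action.
    """
    d = len(losses[0])
    cum = [0.0] * d          # cumulative loss of each action so far
    learner_loss = 0.0       # cumulative expected loss of the learner
    for t, loss in enumerate(losses, start=1):
        # Illustrative anytime step size (assumption): eta_t = sqrt(8 ln d / t).
        eta = math.sqrt(8.0 * math.log(d) / t)
        # Subtract the minimum before exponentiating for numerical stability.
        m = min(cum)
        weights = [math.exp(-eta * (c - m)) for c in cum]
        total = sum(weights)
        probs = [w / total for w in weights]
        # The learner plays the mixed strategy `probs` and suffers its expected loss.
        learner_loss += sum(p * l for p, l in zip(probs, loss))
        cum = [c + l for c, l in zip(cum, loss)]
    n = len(losses)
    return (learner_loss - min(cum)) / n


if __name__ == "__main__":
    random.seed(0)
    d = 5
    for n in (100, 1000, 10000):
        rounds = [[random.random() for _ in range(d)] for _ in range(n)]
        print(n, exp_weights_average_regret(rounds))
```

Running the script prints the average regret for a few horizons on random losses. Because the step size depends only on the current round $t$, the same strategy can be run indefinitely, which is the infinite-horizon setting the abstract describes.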

Panayotis Mertikopoulos | Joon Kwon
