Coin Betting and Parameter-Free Online Learning

In recent years, a number of parameter-free algorithms have been developed for online linear optimization over Hilbert spaces and for learning with expert advice. These algorithms achieve optimal regret bounds that depend on the unknown competitor, without requiring oracle tuning of the learning rates. We present a new, intuitive framework for designing parameter-free algorithms for \emph{both} online linear optimization over Hilbert spaces and learning with expert advice, based on reductions to betting on outcomes of adversarial coins. We instantiate the framework with a betting algorithm based on the Krichevsky-Trofimov estimator. The resulting algorithms are simple, have no parameters to tune, and improve on or match previous results in terms of regret guarantees and per-round complexity.
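The one-dimensional version of the coin-betting reduction can be sketched as follows: treat the negative loss gradient at each round as the outcome of an adversarial coin, bet a Krichevsky-Trofimov fraction of the accumulated wealth on it, and output the bet as the prediction. This is a minimal illustrative sketch, not the paper's exact algorithm; the function and variable names are hypothetical, and it assumes gradients bounded in [-1, 1] and a unit initial wealth.

```python
def kt_coin_betting_olo(grads, initial_wealth=1.0):
    """Hypothetical sketch of 1-D parameter-free online linear
    optimization via Krichevsky-Trofimov coin betting.

    At round t, the coin outcome is c_t = -g_t (the negative loss
    gradient, assumed in [-1, 1]). The KT betting fraction is
    beta_t = (sum of past outcomes) / t, and the prediction is the
    bet itself: w_t = beta_t * Wealth_{t-1}.
    """
    wealth = initial_wealth
    coin_sum = 0.0                 # running sum of past coin outcomes
    predictions = []
    for t, g in enumerate(grads, start=1):
        beta = coin_sum / t        # KT estimate of the coin bias
        w = beta * wealth          # prediction = signed bet on the coin
        predictions.append(w)
        c = -g                     # adversarial coin outcome
        wealth += c * w            # wealth gained or lost by the bet
        coin_sum += c
    return predictions
```

On a favorably biased "coin" (e.g. a constant gradient of -1), the bettor's wealth grows and the predictions move monotonically in the direction of the negative gradient, which is the mechanism behind the parameter-free regret guarantee.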
