Tight Lower Bounds for Multiplicative Weights Algorithmic Families

We study the fundamental problem of prediction with expert advice and develop regret lower bounds for a large family of algorithms for this problem. We develop simple adversarial primitives that lend themselves to various combinations, leading to sharp lower bounds for many algorithmic families. We use these primitives to show that the classic Multiplicative Weights Algorithm (MWA) has a regret of $\sqrt{T \ln(k)/2}$ (where $T$ is the time horizon and $k$ is the number of experts), thereby completely closing the gap between upper and lower bounds. We further show a regret lower bound of $\frac{2}{3}\sqrt{T \ln(k)/2}$ for a much more general family of algorithms than MWA, in which the learning rate can be varied arbitrarily over time, or even drawn from arbitrary distributions over time. We also use our primitives to construct adversaries for MWA in the geometric horizon setting, where $\delta$ is the probability that the game ends in any given round. There we precisely characterize the regret as $0.391/\sqrt{\delta}$ for the case of 2 experts and show a lower bound of $\frac{1}{2}\sqrt{\ln(k)/(2\delta)}$ for an arbitrary number of experts $k$.
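To make the quantities in the abstract concrete, the following is a minimal sketch of MWA with a fixed learning rate, together with the $\sqrt{T \ln(k)/2}$ expression it is compared against. It is an illustration under stated assumptions, not the paper's construction: losses are assumed to lie in [0, 1], the function names (`mwa_regret`, `tuned_eta`, `regret_bound`) are hypothetical, and the learning rate $\sqrt{8 \ln(k)/T}$ is the standard textbook tuning that attains the classical upper bound. The adversarial primitives that establish the matching lower bound are not reproduced here.

```python
import math
import random


def mwa_regret(losses, eta):
    """Run multiplicative weights on a T x k loss matrix with entries in [0, 1].

    Returns the expected loss of the algorithm minus the loss of the best expert.
    """
    k = len(losses[0])
    cumulative = [0.0] * k  # cumulative loss of each expert so far
    algorithm_loss = 0.0
    for round_losses in losses:
        # Weight of expert i is exp(-eta * cumulative loss); subtracting the
        # minimum leaves the normalized weights unchanged and avoids underflow.
        low = min(cumulative)
        weights = [math.exp(-eta * (c - low)) for c in cumulative]
        total = sum(weights)
        # Expected loss when an expert is drawn proportionally to its weight.
        algorithm_loss += sum(w * l for w, l in zip(weights, round_losses)) / total
        cumulative = [c + l for c, l in zip(cumulative, round_losses)]
    return algorithm_loss - min(cumulative)


def tuned_eta(T, k):
    """Textbook fixed learning rate for a known horizon T (an assumption here)."""
    return math.sqrt(8 * math.log(k) / T)


def regret_bound(T, k):
    """The expression sqrt(T * ln(k) / 2) from the abstract."""
    return math.sqrt(T * math.log(k) / 2)


if __name__ == "__main__":
    random.seed(0)
    T, k = 1000, 8
    # Random 0/1 losses: an easy, non-adversarial sequence used only for illustration.
    losses = [[float(random.random() < 0.5) for _ in range(k)] for _ in range(T)]
    print("regret:", mwa_regret(losses, tuned_eta(T, k)))
    print("bound: ", regret_bound(T, k))
```

On random losses the observed regret typically sits well below the bound; the abstract's claim concerns worst-case sequences chosen by an adversary, which is where the bound becomes tight.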
