Tight Lower Bounds for the Multiplicative Weights Algorithm on the Experts Problem

We develop tight lower bounds on the regret achievable by the widely used Multiplicative Weights Algorithm (MWA) for the fundamental problem of prediction with expert advice. It is well known that with $k$ experts and $T$ steps, MWA suffers a (minimax) regret of at most $\sqrt{\frac{T \ln k}{2}}$ [7]. In this work, we show that as $T \to \infty$, MWA suffers a regret of at least $\sqrt{\frac{T \ln k}{2}}$ for every even $k$, thereby precisely characterizing MWA's regret. We develop an almost identical characterization for every odd $k > 1$, where we show that as $T \to \infty$, MWA suffers a regret of at least $\sqrt{\frac{T \ln k}{2}}\left(1 - \frac{1}{k^2}\right)$. The previously best known lower bound on MWA's regret was $\frac{1}{4}\sqrt{T \log_2 k}$ [10], which is a factor of 2.35 smaller than the optimal regret. In the equally natural geometric horizon model, where the number of steps is a geometric random variable with mean $1/\delta$, it is known that MWA suffers a regret of at most $\sqrt{\frac{\ln k}{2\delta}}$. For every $k \geq 2$, we show that as $\delta \to 0$, MWA suffers a regret of at least $\frac{1}{2}\sqrt{\frac{\ln k}{2\delta}}$, thereby shrinking the gap between the known upper and lower bounds to a factor of 2. For the special case of $k = 2$ experts, we close the gap completely and precisely characterize MWA's regret as $\frac{0.391}{\sqrt{\delta}}$ as $\delta \to 0$.

We obtain the lower bounds by characterizing the structure of the optimal adversary. We show that the optimal adversary is obtained by a careful combination of two simple adversaries that exploit the weaknesses of MWA in opposite ways. Strikingly, despite the apparent similarity between the finite and geometric horizon models, we show that the structures of the optimal adversaries for the two models are mirror images of each other in a strong sense.

∗ Microsoft Research. One Memorial Drive, Cambridge, MA 02142. ngravin@gmail.com.
† Microsoft Research. One Microsoft Way, Redmond, WA 98052. peres@microsoft.com.
‡ Google Research. 111 8th Ave, New York, NY 10011. balusivan@google.com.
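For context, the factor of 2.35 follows from the ratio of the two bounds: $\sqrt{\frac{T \ln k}{2}} \big/ \frac{1}{4}\sqrt{T \log_2 k} = 4\sqrt{\frac{\ln 2}{2}} \approx 2.35$, independent of $k$. The following is a minimal Python sketch of the Multiplicative Weights Algorithm (Hedge) in the experts setting discussed above. The learning rate eta = sqrt(2 ln(k) / T) and the {0, 1} cost model are illustrative assumptions consistent with the $\sqrt{\frac{T \ln k}{2}}$ upper bound, not necessarily the exact parameterization analyzed in the paper, and the random adversary in the driver is only for demonstration; it is not the optimal adversary constructed in the paper.

import math
import random

def multiplicative_weights(k, T, cost_fn):
    """Run MWA for T steps with k experts.

    cost_fn(t) must return a length-k list of per-expert costs in [0, 1]
    for step t (the adversary's choice). Returns (algorithm_cost,
    best_expert_cost) so the caller can compute the regret.
    """
    eta = math.sqrt(2.0 * math.log(k) / T)   # illustrative tuning
    weights = [1.0] * k
    alg_cost = 0.0
    cum_costs = [0.0] * k
    for t in range(T):
        total = sum(weights)
        probs = [w / total for w in weights]  # play expert i with prob. probs[i]
        costs = cost_fn(t)
        alg_cost += sum(p * c for p, c in zip(probs, costs))  # expected cost this step
        for i in range(k):
            cum_costs[i] += costs[i]
            weights[i] *= math.exp(-eta * costs[i])  # multiplicative update
    return alg_cost, min(cum_costs)

if __name__ == "__main__":
    # Demonstration with a uniformly random {0, 1} adversary for k = 2 experts.
    k, T = 2, 10000
    alg, best = multiplicative_weights(
        k, T, lambda t: [random.randint(0, 1) for _ in range(k)])
    print("regret:", alg - best,
          "upper bound:", math.sqrt(T * math.log(k) / 2.0))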

[1]  Yuval Peres,et al.  Towards Optimal Algorithms for Prediction with Expert Advice , 2014, SODA.

[2]  Wang Feng,et al.  Online Learning Algorithms for Big Data Analytics: A Survey , 2015 .

[3]  Francesco Orabona,et al.  Unconstrained Online Linear Learning in Hilbert Spaces: Minimax Algorithms and Normal Approximations , 2014, COLT.

[4]  Haipeng Luo,et al.  Towards Minimax Online Learning with Unknown Time Horizon , 2013, ICML.

[5]  Wouter M. Koolen The Pareto Regret Frontier , 2013, NIPS.

[6]  H. Brendan McMahan,et al.  Minimax Optimal Algorithms for Unconstrained Linear Optimization , 2013, NIPS.

[7]  Ohad Shamir,et al.  Relax and Randomize : From Value to Algorithms , 2012, NIPS.

[8]  Ambuj Tewari,et al.  Online Learning: Beyond Regret , 2010, COLT.

[9]  Ambuj Tewari,et al.  Online Learning: Random Averages, Combinatorial Parameters, and Learnability , 2010, NIPS.

[10]  Peter L. Bartlett,et al.  A Stochastic View of Optimal Regret through Minimax Duality , 2009, COLT.

[11]  Robert E. Schapire,et al.  Learning with continuous experts using drifting games , 2008, Theor. Comput. Sci.

[12]  Ambuj Tewari,et al.  Optimal Strategies and Minimax Lower Bounds for Online Convex Games , 2008, COLT.

[13]  Manfred K. Warmuth,et al.  When Random Play is Optimal Against an Adversary , 2008, COLT.

[14]  Elad Hazan,et al.  Logarithmic regret algorithms for online convex optimization , 2006, Machine Learning.

[15]  Gábor Lugosi,et al.  Prediction, learning, and games , 2006 .

[16]  Martin Zinkevich,et al.  Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.

[17]  Nicolò Cesa-Bianchi,et al.  Analysis of two gradient-based algorithms for on-line regression , 1997, COLT '97.

[18]  Adam Tauman Kalai,et al.  Universal Portfolios With and Without Transaction Costs , 1997, COLT '97.

[19]  Manfred K. Warmuth,et al.  How to use expert advice , 1997, JACM.

[20]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1997, EuroCOLT.

[21]  David Haussler,et al.  Tight worst-case loss bounds for predicting with expert advice , 1994, EuroCOLT.

[22]  Vladimir Vovk,et al.  Aggregating strategies , 1990, COLT '90.

[23]  Manfred K. Warmuth,et al.  The weighted majority algorithm , 1989, 30th Annual Symposium on Foundations of Computer Science.

[24]  James Hannan,et al.  Approximation to Bayes Risk in Repeated Play , 1958 .