Optimal amortized regret in every interval

Consider the classical problem of predicting the next bit in a sequence of bits. A standard performance measure is {\em regret}: the loss in payoff with respect to a set of experts. For example, if we measure performance against two constant experts, one that always predicts 0 and another that always predicts 1, it is well known that one can achieve regret $O(\sqrt{T})$ with respect to the best expert by using, say, the weighted majority algorithm. But this algorithm provides no performance guarantee on subintervals of the sequence. Other algorithms ensure regret $O(\sqrt{x \log T})$ on any interval of length $x$. In this paper we present a randomized algorithm that, in an amortized sense, achieves regret $O(\sqrt{x})$ on every interval when the sequence is partitioned into intervals arbitrarily. We empirically estimated the constant in the $O(\cdot)$ for $T$ up to 2000 and found it to be small, around 2.1. We also experimentally evaluate the efficacy of this algorithm in predicting high-frequency stock data.
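To make the baseline concrete, here is a minimal sketch of the randomized weighted majority (Hedge) baseline mentioned above, run against the two constant experts. This is only the classical $O(\sqrt{T})$ baseline, not the paper's amortized per-interval algorithm; the learning-rate tuning $\eta = \sqrt{8 \ln 2 / T}$ and the function name hedge_predict are standard illustrative choices, not taken from the paper.

    # Sketch of randomized weighted majority (Hedge) with two constant
    # experts: one always predicts 0, the other always predicts 1.
    # With eta = sqrt(8 ln N / T) and N = 2 experts, the standard bound
    # gives expected regret about sqrt((T ln 2) / 2), i.e. O(sqrt(T)).
    import math
    import random

    def hedge_predict(bits):
        """Run Hedge on a bit sequence; return (algorithm loss, best expert loss)."""
        T = len(bits)
        eta = math.sqrt(8 * math.log(2) / max(T, 1))  # standard tuning, assumes T known
        weights = [1.0, 1.0]      # weights for the always-0 and always-1 experts
        alg_loss = 0.0
        expert_loss = [0, 0]      # cumulative 0/1 losses of each expert
        for b in bits:
            total = weights[0] + weights[1]
            p1 = weights[1] / total          # probability of predicting 1
            pred = 1 if random.random() < p1 else 0
            alg_loss += (pred != b)
            # Expert 0 errs when b == 1 (loss b); expert 1 errs when b == 0 (loss 1 - b).
            losses = [b, 1 - b]
            for i in (0, 1):
                expert_loss[i] += losses[i]
                weights[i] *= math.exp(-eta * losses[i])
        return alg_loss, min(expert_loss)

    if __name__ == "__main__":
        seq = [random.randint(0, 1) for _ in range(2000)]
        alg, best = hedge_predict(seq)
        print(f"regret = {alg - best:.1f}, sqrt(T) = {math.sqrt(len(seq)):.1f}")

Note that the guarantee of this baseline holds only over the whole horizon $T$: on a subinterval where the best expert switches, its regret can be as large as the interval length, which is exactly the gap the paper's algorithm addresses.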
