Potential-Based Algorithms in Online Prediction and Game Theory

In this paper we show that several known algorithms for sequential prediction problems (including the quasi-additive family of Grove et al. and Littlestone and Warmuth's Weighted Majority), for playing iterated games (including Freund and Schapire's Hedge and MW, as well as the Λ-strategies of Hart and Mas-Colell), and for boosting (including AdaBoost) are special cases of a general decision strategy based on the notion of potential. By analyzing this strategy we derive known performance bounds, as well as new bounds, as simple corollaries of a single general theorem. Besides offering a new and unified view of a large family of algorithms, we establish a connection between potential-based analysis in learning and its counterparts developed independently in game theory. By exploiting this connection, we show that certain learning problems are instances of more general game-theoretic problems. In particular, we describe a notion of generalized regret and show its applications in learning theory.
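To make the idea of a potential-based strategy concrete, the following is a minimal sketch and not the paper's own formulation. It assumes the standard expert-advice setting, in which the forecaster keeps a vector of cumulative regrets and weights each expert in proportion to the gradient of a potential function evaluated at that vector. The function names, the learning rate `eta`, and the exponent `p` are illustrative assumptions. Under these assumptions, an exponential potential yields Hedge-style exponential weights, and a polynomial potential yields weights of the p-norm type.

```python
import math

def exp_potential_grad(regrets, eta=0.5):
    # Gradient direction of Phi(R) = (1/eta) * ln(sum_i exp(eta * R_i)).
    # Subtracting the max keeps the exponentials numerically stable;
    # the common factor cancels when the weights are normalized.
    m = max(regrets)
    return [math.exp(eta * (r - m)) for r in regrets]

def poly_potential_grad(regrets, p=2.0):
    # Gradient direction of Phi(R) = sum_i max(R_i, 0)^p (constant factor dropped).
    return [max(r, 0.0) ** (p - 1) for r in regrets]

def potential_weights(regrets, grad):
    # Normalize the potential's gradient into a probability vector.
    g = grad(regrets)
    total = sum(g)
    n = len(regrets)
    if total == 0.0:
        # No expert has positive gradient mass: fall back to uniform weights.
        return [1.0 / n] * n
    return [x / total for x in g]

def run(loss_rounds, grad):
    # loss_rounds: list of per-round loss vectors, one loss per expert.
    n = len(loss_rounds[0])
    regrets = [0.0] * n
    total_loss = 0.0
    for losses in loss_rounds:
        w = potential_weights(regrets, grad)
        mixed = sum(wi * li for wi, li in zip(w, losses))  # forecaster's expected loss
        total_loss += mixed
        # Regret against expert i grows by (forecaster loss - expert i loss).
        regrets = [r + mixed - l for r, l in zip(regrets, losses)]
    return total_loss, regrets
```

For example, `run([[0.0, 1.0]] * 10, exp_potential_grad)` plays ten rounds against two experts, the first of which never errs. The weights shift toward that expert after the first round, so the forecaster's cumulative loss stays well below that of uniform mixing.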
