Potential-Based Algorithms in On-Line Prediction and Game Theory

In this paper we show that several known algorithms for sequential prediction problems (including Weighted Majority and the quasi-additive family of Grove, Littlestone, and Schuurmans), for playing iterated games (including Freund and Schapire's Hedge and MW, as well as the Λ-strategies of Hart and Mas-Colell), and for boosting (including AdaBoost) are special cases of a general decision strategy based on the notion of potential. By analyzing this strategy we derive known performance bounds, as well as new bounds, as simple corollaries of a single general theorem. Besides offering a new and unified view of a large family of algorithms, we establish a connection between potential-based analyses in learning and their counterparts independently developed in game theory. By exploiting this connection, we show that certain learning problems are instances of more general game-theoretic problems. In particular, we describe a notion of generalized regret and show its applications in learning theory.
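To make the idea of a potential-based strategy concrete, the following is a minimal sketch, not the paper's own pseudocode. It assumes the standard experts setting with losses in [0, 1], and it takes the forecaster's weights to be proportional to the gradient of a potential evaluated at the cumulative regret vector. The function names (`exp_weights`, `poly_weights`, `run_forecaster`) and the parameters `eta` and `p` are illustrative choices, not names from the abstract. With an exponential potential the weights reduce to the familiar multiplicative (Hedge-style) update, while a polynomial potential yields p-norm-style weights.

```python
import math

def exp_weights(regrets, eta=1.0):
    """Weights from the gradient of an exponential potential.

    Assumed potential: (1/eta) * log(sum_i exp(eta * R_i)); its gradient
    is proportional to exp(eta * R_i), i.e. a Hedge-style weighting.
    """
    m = max(regrets)  # subtract the max for numerical stability
    w = [math.exp(eta * (r - m)) for r in regrets]
    s = sum(w)
    return [x / s for x in w]

def poly_weights(regrets, p=2.0):
    """Weights from the gradient of a polynomial potential (assumes p >= 2).

    Assumed potential: ||R_+||_p^2; its gradient is proportional to
    (R_i)_+^(p-1), where (x)_+ = max(x, 0).
    """
    w = [max(r, 0.0) ** (p - 1) for r in regrets]
    s = sum(w)
    if s == 0.0:
        # No expert has positive regret: fall back to uniform weights.
        return [1.0 / len(regrets)] * len(regrets)
    return [x / s for x in w]

def run_forecaster(loss_rounds, weight_fn):
    """Play the weighted-average forecaster over a sequence of loss vectors.

    Returns the forecaster's total (mixture) loss and the final vector of
    cumulative regrets against each expert.
    """
    n = len(loss_rounds[0])
    regrets = [0.0] * n
    total = 0.0
    for losses in loss_rounds:
        w = weight_fn(regrets)
        mix = sum(wi * li for wi, li in zip(w, losses))  # forecaster's loss
        total += mix
        # Regret to expert i grows by (forecaster loss - expert i's loss).
        regrets = [r + (mix - l) for r, l in zip(regrets, losses)]
    return total, regrets
```

Swapping `weight_fn` is the only change needed to move between the two families, which is the sense in which a single potential-based template can cover several algorithms.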
