Potential-Based Algorithms in Online Prediction and Game Theory

In this paper we show that several known algorithms for sequential prediction problems (including the quasi-additive family of Grove et al. and Littlestone and Warmuth's Weighted Majority), for playing iterated games (including Freund and Schapire's Hedge and MW, as well as the Λ-strategies of Hart and Mas-Colell), and for boosting (including AdaBoost) are special cases of a general decision strategy based on the notion of potential. By analyzing this strategy we derive known performance bounds, as well as new bounds, as simple corollaries of a single general theorem. Besides offering a new and unified view of a large family of algorithms, we establish a connection between potential-based analysis in learning and its counterparts independently developed in game theory. By exploiting this connection, we show that certain learning problems are instances of more general game-theoretic problems. In particular, we describe a notion of generalized regret and show its applications in learning theory.
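To make the idea of a potential-based strategy concrete, here is a minimal Python sketch, not the paper's own formulation. It assumes the standard expert-advice setting with losses in [0, 1], and it assumes the forecaster weights each expert in proportion to the gradient of a potential evaluated at the cumulative regret vector. The function names (`exp_potential_weights`, `poly_potential_weights`, `run_forecaster`) and the parameters `eta` and `p` are illustrative choices, not taken from the source.

```python
import math

def exp_potential_weights(regrets, eta):
    # Normalized gradient of the exponential potential
    # Phi(r) = (1/eta) * log(sum_i exp(eta * r_i)).
    # Subtracting the max keeps the exponentials numerically stable.
    m = max(regrets)
    w = [math.exp(eta * (r - m)) for r in regrets]
    total = sum(w)
    return [x / total for x in w]

def poly_potential_weights(regrets, p):
    # Normalized gradient of the polynomial potential Phi(r) = ||r_+||_p^2:
    # weights are proportional to max(r_i, 0)^(p - 1), with p >= 2.
    w = [max(r, 0.0) ** (p - 1) for r in regrets]
    total = sum(w)
    n = len(regrets)
    if total == 0.0:
        # No expert is ahead of the forecaster yet: fall back to uniform.
        return [1.0 / n] * n
    return [x / total for x in w]

def run_forecaster(loss_rounds, weight_fn):
    # loss_rounds: list of per-round loss vectors, one entry per expert.
    # weight_fn: maps the cumulative regret vector to a probability vector.
    n = len(loss_rounds[0])
    regrets = [0.0] * n
    total_loss = 0.0
    for losses in loss_rounds:
        weights = weight_fn(regrets)
        # Forecaster's (expected) loss under the current weights.
        mixed = sum(w * l for w, l in zip(weights, losses))
        total_loss += mixed
        # Regret to expert i grows by (forecaster loss - expert i loss).
        regrets = [r + (mixed - l) for r, l in zip(regrets, losses)]
    return total_loss, regrets
```

Under these assumptions, calling `run_forecaster(rounds, lambda r: exp_potential_weights(r, eta))` reproduces exponentially weighted averaging in the style of Hedge, because the forecaster's own loss cancels in the normalization. Swapping in `poly_potential_weights` gives a polynomially weighted forecaster from the same loop.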

Nicolò Cesa-Bianchi | Gábor Lugosi

[1] David Haussler, et al. Sequential Prediction of Individual Sequences Under General Loss Functions, 1998, IEEE Trans. Inf. Theory.

[2] S. Hart, et al. A General Class of Adaptive Strategies, 1999.

[3] N. Littlestone. Mistake bounds and logarithmic linear-threshold learning algorithms, 1990.

[4] Vladimir Vovk, et al. Aggregating strategies, 1990, COLT '90.

[5] Vladimir Vovk, et al. A game of prediction with expert advice, 1995, COLT '95.

[6] Claudio Gentile, et al. Linear Hinge Loss and Average Margin, 1998, NIPS.

[7] Claudio Gentile, et al. The Robustness of the p-Norm Algorithms, 1999, COLT '99.

[8] Manfred K. Warmuth, et al. The Weighted Majority Algorithm, 1994, Inf. Comput.

[9] Dale Schuurmans, et al. General Convergence Results for Linear Discriminant Updates, 1997, COLT '97.

[10] D. Fudenberg, et al. Conditional Universal Consistency, 1999.

[11] Ehud Lehrer, et al. A wide range no-regret theorem, 2003, Games Econ. Behav.

[12] R. Vohra, et al. Calibrated Learning and Correlated Equilibrium, 1996.

[13] Yoav Freund, et al. Large Margin Classification Using the Perceptron Algorithm, 1998, COLT.

[14] Y. Freund, et al. Adaptive game playing using multiplicative weights, 1999.

[15] Frank Rosenblatt, et al. Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms, 1963.

[16] Nicolò Cesa-Bianchi, et al. Analysis of Two Gradient-Based Algorithms for On-Line Regression, 1999.

[17] Thomas M. Cover, et al. Elements of Information Theory, 2005.

[18] Valerie Isham, et al. Non-Negative Matrices and Markov Chains, 1983.

[19] D. Blackwell. An analog of the minimax theorem for vector payoffs, 1956.

[20] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1997, EuroCOLT.

[21] S. Hart, et al. A simple adaptive procedure leading to correlated equilibrium, 2000.

[22] Yoram Singer, et al. Using and combining predictors that specialize, 1997, STOC '97.

[23] Manfred K. Warmuth, et al. Averaging Expert Predictions, 1999, EuroCOLT.

[24] H. D. Block. The perceptron: a model for brain functioning. I, 1962.

[25] Dean P. Foster, et al. Regret in the On-Line Decision Problem, 1999.