
Potential-Based Algorithms in On-Line Prediction and Game Theory

In this paper we show that several known algorithms for sequential prediction problems (including Weighted Majority and the quasi-additive family of Grove, Littlestone, and Schuurmans), for playing iterated games (including Freund and Schapire's Hedge and MW, as well as the Λ-strategies of Hart and Mas-Colell), and for boosting (including AdaBoost) are special cases of a general decision strategy based on the notion of potential. By analyzing this strategy we derive known performance bounds, as well as new bounds, as simple corollaries of a single general theorem. Besides offering a new and unified view of a large family of algorithms, we establish a connection between potential-based analysis in learning and its counterparts independently developed in game theory. By exploiting this connection, we show that certain learning problems are instances of more general game-theoretic problems. In particular, we describe a notion of generalized regret and show its applications in learning theory.
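As a rough illustration of the idea of a potential-based strategy, the sketch below plays a weighted average of N experts, with weights proportional to the gradient of a potential function evaluated at the vector of cumulative regrets. It is a minimal sketch under stated assumptions, not the paper's formal construction: the function names (`exp_potential_weights`, `poly_potential_weights`, `run_forecaster`), the loss range [0, 1], and the fixed parameters `eta` and `p` are all illustrative choices. An exponential potential gives exponentially weighted averaging of the Hedge and Weighted Majority kind; a polynomial potential gives p-norm style weights.

```python
import math

def exp_potential_weights(regrets, eta):
    """Normalized gradient of the (assumed) potential (1/eta) * ln(sum_i exp(eta * R_i))."""
    m = max(regrets)  # shift by the maximum for numerical stability
    g = [math.exp(eta * (r - m)) for r in regrets]
    s = sum(g)
    return [x / s for x in g]

def poly_potential_weights(regrets, p):
    """Weights proportional to (R_i)_+^(p-1), the gradient direction of a p-norm potential."""
    g = [max(r, 0.0) ** (p - 1) for r in regrets]
    s = sum(g)
    if s == 0.0:
        # No expert has positive regret yet: fall back to uniform weights.
        return [1.0 / len(regrets)] * len(regrets)
    return [x / s for x in g]

def run_forecaster(loss_rounds, weight_fn):
    """Run the weighted-average strategy on a list of per-round expert loss vectors.

    Returns the forecaster's cumulative (mixture) loss and the final regret vector,
    where regret against expert i is forecaster loss minus expert i's loss.
    """
    n = len(loss_rounds[0])
    regrets = [0.0] * n
    total_loss = 0.0
    for losses in loss_rounds:
        w = weight_fn(regrets)
        mixed = sum(wi * li for wi, li in zip(w, losses))  # loss of the weighted mixture
        total_loss += mixed
        regrets = [r + mixed - l for r, l in zip(regrets, losses)]
    return total_loss, regrets

# Usage: two experts, the first is always right, the second always wrong.
total, regrets = run_forecaster(
    [[0.0, 1.0]] * 3, lambda r: exp_potential_weights(r, eta=1.0)
)
print(total, regrets)
```

Swapping `weight_fn` is the only change needed to move between the two potentials, which is the sense in which a single strategy covers several algorithms in this sketch.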

Nicolò Cesa-Bianchi | Gábor Lugosi

[1] D. Blackwell. An analog of the minimax theorem for vector payoffs, 1956.

[2] James Hannan, et al. Approximation to Bayes Risk in Repeated Play, 1958.

[3] A. A. Mullin, et al. Principles of neurodynamics, 1962.

[4] H. D. Block. The perceptron: a model for brain functioning. I, 1962.

[5] Albert B Novikoff, et al. On Convergence Proofs for Perceptrons, 1963.

[6] L. Bregman. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming, 1967.

[7] Valerie Isham, et al. Non-Negative Matrices and Markov Chains, 1983.

[8] Manfred K. Warmuth, et al. The weighted majority algorithm, 1989, 30th Annual Symposium on Foundations of Computer Science.

[9] Vladimir Vovk, et al. Aggregating strategies, 1990, COLT '90.

[10] N. Littlestone. Mistake bounds and logarithmic linear-threshold learning algorithms, 1990.

[11] Thomas M. Cover, et al. Elements of Information Theory, 2005.

[12] David Haussler, et al. How to use expert advice, 1993, STOC.

[13] D. Fudenberg, et al. Consistency and Cautious Fictitious Play, 1995.

[14] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1995, EuroCOLT.

[15] Vladimir Vovk, et al. A game of prediction with expert advice, 1995, COLT '95.

[16] Dean P. Foster, et al. Calibrated Learning and Correlated Equilibrium, 1997.

[17] Yoram Singer, et al. Context-sensitive learning methods for text categorization, 1996, SIGIR '96.

[18] Manfred K. Warmuth, et al. How to use expert advice, 1997, JACM.

[19] S. Hart, et al. A simple adaptive procedure leading to correlated equilibrium, 2000.

[20] Yoram Singer, et al. Using and combining predictors that specialize, 1997, STOC '97.

[21] Dale Schuurmans, et al. General Convergence Results for Linear Discriminant Updates, 1997, COLT '97.

[22] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1997, EuroCOLT.

[23] David Haussler, et al. Sequential Prediction of Individual Sequences Under General Loss Functions, 1998, IEEE Trans. Inf. Theory.

[24] Claudio Gentile, et al. Linear Hinge Loss and Average Margin, 1998, NIPS.

[25] Yoav Freund, et al. Large Margin Classification Using the Perceptron Algorithm, 1998, COLT.

[26] Y. Freund, et al. Adaptive game playing using multiplicative weights, 1999.

[27] Claudio Gentile, et al. The Robustness of the p-Norm Algorithms, 1999, COLT '99.

[28] Nicolò Cesa-Bianchi, et al. Analysis of Two Gradient-Based Algorithms for On-Line Regression, 1999.

[29] D. Fudenberg, et al. Conditional Universal Consistency, 1999.

[30] Manfred K. Warmuth, et al. Averaging Expert Predictions, 1999, EuroCOLT.

[31] S. Hart, et al. A General Class of Adaptive Strategies, 1999.

[32] Dean P. Foster, et al. Regret in the On-Line Decision Problem, 1999.

[33] V. Vovk. Competitive On-line Statistics, 2001.

[34] Claudio Gentile, et al. Adaptive and Self-Confident On-Line Learning Algorithms, 2000, J. Comput. Syst. Sci.

[35] Ehud Lehrer, et al. A wide range no-regret theorem, 2003, Games Econ. Behav.

[36] Manfred K. Warmuth, et al. Relative Loss Bounds for Multidimensional Regression Problems, 1997, Machine Learning.

[37] Robert E. Schapire, et al. Drifting Games, 1999, COLT '99.

[38] Claudio Gentile, et al. A Second-Order Perceptron Algorithm, 2002, SIAM J. Comput.