Regret in the On-Line Decision Problem

Abstract

At each point in time a decision maker must make a decision. The payoff in each period depends on the decision made and on the state of the world that obtains in that period. The difficulty is that the decision must be made in advance of any knowledge, even probabilistic, about which state of the world will obtain. A range of problems from a variety of disciplines can be framed in this way. In this paper we survey the main results obtained, as well as some of their applications.

Journal of Economic Literature Classification Numbers: C70, C73.
