A Predictive Theory of Games

Conventional noncooperative game theory hypothesizes that the joint (mixed) strategy of a set of reasoning players in a game will necessarily satisfy an “equilibrium concept”. The set of joint strategies satisfying that equilibrium concept has measure zero, and all other joint strategies are considered impossible. Under this hypothesis the only issue is which equilibrium concept is “correct”. This hypothesis violates the first-principles arguments underlying probability theory. Indeed, probability theory renders moot the controversy over which equilibrium concept is correct: although in general some joint (mixed) strategies have zero probability, the set of joint strategies with non-zero probability in general has measure greater than zero. Rather than a first-principles derivation of an equilibrium concept, game theory requires a first-principles derivation of a distribution over joint strategies. Suppose, however, that one wishes to predict a single joint strategy from that distribution. Decision theory then tells us to first specify a loss function, which describes how we, the analysts or scientists external to the game, will use that prediction. We then predict that the game will result in the joint strategy that is Bayes-optimal for that loss function and distribution over joint strategies. Different loss functions (that is, different uses of the prediction) give different optimal predictions. There is no longer any role for an “equilibrium concept” that is independent of the distribution and the choice of loss function. This application of probability theory to games, not just within games, is called Predictive Game Theory (PGT). This paper shows how information theory provides a first-principles argument for how to set a distribution over joint strategies. The connection of this distribution to the bounded rational Quantal Response Equilibrium (QRE) is elaborated. In particular, taking the QRE to be an approximation to the mode of the distribution, correction terms to the QRE are derived. In addition, some Nash equilibria are not approached by any limiting sequence of increasingly rational QRE joint strategies. However, it is shown here that every Nash equilibrium is approached by a limiting sequence of joint strategies, all of which have non-zero probability. (In general, though, not all strategies in those sequences are modes of the associated distributions over joint strategies.)
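The claim that different loss functions give different Bayes-optimal predictions can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the three candidate joint strategies, their probabilities, and the function names are hypothetical, and a continuous distribution over joint strategies is replaced by a discrete one. Under 0-1 loss the Bayes-optimal prediction is the mode of the distribution; under squared-error loss it is the mean.

```python
# Hypothetical discrete distribution over joint mixed strategies of a 2x2 game.
# Each candidate is (p_row, p_col): the probability that the row player and the
# column player, respectively, choose their first action. The candidates and
# their probabilities are made up for illustration only.
candidates = [(0.0, 0.0), (1.0, 1.0), (0.9, 0.8)]
probs = [0.40, 0.35, 0.25]


def predict_zero_one(candidates, probs):
    """Bayes-optimal prediction under 0-1 loss: the most probable candidate."""
    best = max(range(len(candidates)), key=lambda i: probs[i])
    return candidates[best]


def predict_quadratic(candidates, probs):
    """Bayes-optimal prediction under squared-error loss: the mean."""
    dim = len(candidates[0])
    return tuple(
        sum(p * q[k] for q, p in zip(candidates, probs)) for k in range(dim)
    )


def squared_error(prediction, truth):
    """Squared Euclidean distance between a prediction and a joint strategy."""
    return sum((a - b) ** 2 for a, b in zip(prediction, truth))


def expected_loss(prediction, candidates, probs, loss):
    """Expected loss of a prediction under the distribution over candidates."""
    return sum(p * loss(prediction, q) for q, p in zip(candidates, probs))


if __name__ == "__main__":
    mode = predict_zero_one(candidates, probs)
    mean = predict_quadratic(candidates, probs)
    print("0-1 loss prediction (mode):", mode)
    print("squared-error prediction (mean):", mean)
    print("expected squared error of mode:",
          expected_loss(mode, candidates, probs, squared_error))
    print("expected squared error of mean:",
          expected_loss(mean, candidates, probs, squared_error))
```

In this toy case the two predictions differ: the mode sits at one of the candidate joint strategies, while the mean lies between them and has lower expected squared error. Neither is "the" prediction until the analyst fixes the loss function.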
