Should I remember more than you? Best responses to factored strategies

In this paper we offer a new, unifying approach to modeling strategies of bounded complexity. In our model, a player's strategy does not map the set $H$ of histories directly to her set of actions. Instead, the player's perception of $H$ is represented by a map $\varphi : H \rightarrow X$, where $X$ reflects the "cognitive complexity" of the player, and the strategy chooses its mixed action at a history $h$ as a function of $\varphi(h)$. In this case we say that $\varphi$ is a factor of the strategy and that the strategy is $\varphi$-factored. Stationary strategies, strategies played by finite automata, and strategies with bounded recall are the most prominent examples of factored strategies in multistage games. A factor $\varphi$ is recursive if its value at a history $h'$ that follows history $h$ is a function of $\varphi(h)$ and the incremental information $h' \setminus h$. For example, in a repeated game with perfect monitoring, a factor $\varphi$ is recursive if its value $\varphi(a_1,\ldots,a_t)$ on a finite string of action profiles $(a_1,\ldots,a_t)$ is a function of $\varphi(a_1,\ldots,a_{t-1})$ and $a_t$. We prove that in a discounted infinitely repeated game and, more generally, in a stochastic game with finitely many actions and perfect monitoring, if the factor $\varphi$ is recursive, then for every profile of $\varphi$-factored strategies there is a pure $\varphi$-factored strategy that is a best reply. Moreover, if the stochastic game has finitely many states and actions and the factor $\varphi$ has finite range, then there is a pure $\varphi$-factored strategy that is a best reply in all discounted games with a sufficiently large discount factor.
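To make the definitions concrete, here is a minimal Python sketch, not taken from the paper, of a recursive factor and a pure factored strategy in a repeated Prisoner's Dilemma with perfect monitoring. The factor is bounded recall of length k (remember only the last k action profiles), which is recursive because the value at an extended history depends only on the previous value and the newest action profile; tit-for-tat is then a pure strategy factored through recall of length 1. All names (RecallFactor, tit_for_tat, play) are illustrative, not from the paper.

```python
"""Sketch: a recursive factor (bounded recall) and phi-factored strategies."""

ACTIONS = ("C", "D")  # cooperate / defect in the Prisoner's Dilemma


class RecallFactor:
    """Bounded recall of length k: phi(h) is the last k action profiles of h.

    Recursive: phi(h, a_t) is computed from phi(h) and a_t alone, so a player
    never needs the full history to play a phi-factored strategy.
    """

    def __init__(self, k):
        self.k = k

    def initial(self):
        return ()  # phi of the empty history

    def update(self, value, action_profile):
        # phi(h, a_t) = g(phi(h), a_t): append a_t, keep only the last k entries.
        return (value + (action_profile,))[-self.k:]


def tit_for_tat(value):
    """Pure phi-factored strategy for recall length 1:
    cooperate at the empty history, otherwise mirror the opponent's last action."""
    if not value:
        return "C"
    _, opponent_last = value[-1]
    return opponent_last


def always_defect(value):
    """Another pure phi-factored strategy: its action ignores phi(h) entirely."""
    return "D"


def play(factor, strategy_1, strategy_2, rounds=5):
    """Simulate the repeated game; each player conditions only on her factor value."""
    v1, v2 = factor.initial(), factor.initial()
    history = []
    for _ in range(rounds):
        a1, a2 = strategy_1(v1), strategy_2(v2)
        history.append((a1, a2))
        # Recursive update of the factor values, one step per stage.
        v1 = factor.update(v1, (a1, a2))
        v2 = factor.update(v2, (a2, a1))  # player 2 sees profiles from her own viewpoint
    return history


if __name__ == "__main__":
    factor = RecallFactor(k=1)
    print(play(factor, tit_for_tat, always_defect))
    # -> [('C', 'D'), ('D', 'D'), ('D', 'D'), ('D', 'D'), ('D', 'D')]
```

The sketch illustrates the setting of the paper's result: since the opponent's strategy is factored through a recursive factor with finite range, a best reply can also be sought among pure strategies that condition only on the same factor value.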
