Should I remember more than you? Best responses to factored strategies

In this paper we offer a new, unifying approach to modeling strategies of bounded complexity. In our model, a player's strategy does not map the set $H$ of histories directly to her set of actions. Instead, the player's perception of $H$ is represented by a map $\varphi : H \rightarrow X$, where $X$ reflects the "cognitive complexity" of the player, and the strategy chooses its mixed action at a history $h$ as a function of $\varphi(h)$. In this case we say that $\varphi$ is a factor of the strategy and that the strategy is $\varphi$-factored. Stationary strategies, strategies played by finite automata, and strategies with bounded recall are the most prominent examples of factored strategies in multistage games. A factor $\varphi$ is recursive if its value at a history $h'$ that follows history $h$ is a function of $\varphi(h)$ and the incremental information $h' \setminus h$. For example, in a repeated game with perfect monitoring, a factor $\varphi$ is recursive if its value $\varphi(a_1,\ldots,a_t)$ on a finite string of action profiles $(a_1,\ldots,a_t)$ is a function of $\varphi(a_1,\ldots,a_{t-1})$ and $a_t$. We prove that in a discounted infinitely repeated game and, more generally, in a stochastic game with finitely many actions and perfect monitoring, if the factor $\varphi$ is recursive, then for every profile of $\varphi$-factored strategies there is a pure $\varphi$-factored strategy that is a best reply. Moreover, if the stochastic game has finitely many states and actions and the factor $\varphi$ has finite range, then there is a pure $\varphi$-factored strategy that is a best reply in all discounted games with a sufficiently large discount factor.
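To make the definitions concrete, here is a minimal Python sketch, not taken from the paper, of a recursive factor and a pure factored strategy in a repeated Prisoner's Dilemma with perfect monitoring. The factor is bounded recall of length k (remember only the last k action profiles), which is recursive because the value at an extended history depends only on the previous value and the newest action profile; tit-for-tat is then a pure strategy factored through recall of length 1. All names (RecallFactor, tit_for_tat, play) are illustrative, not from the paper.

```python
"""Sketch: a recursive factor (bounded recall) and phi-factored strategies."""

ACTIONS = ("C", "D")  # cooperate / defect in the Prisoner's Dilemma


class RecallFactor:
    """Bounded recall of length k: phi(h) is the last k action profiles of h.

    Recursive: phi(h, a_t) is computed from phi(h) and a_t alone, so a player
    never needs the full history to play a phi-factored strategy.
    """

    def __init__(self, k):
        self.k = k

    def initial(self):
        return ()  # phi of the empty history

    def update(self, value, action_profile):
        # phi(h, a_t) = g(phi(h), a_t): append a_t, keep only the last k entries.
        return (value + (action_profile,))[-self.k:]


def tit_for_tat(value):
    """Pure phi-factored strategy for recall length 1:
    cooperate at the empty history, otherwise mirror the opponent's last action."""
    if not value:
        return "C"
    _, opponent_last = value[-1]
    return opponent_last


def always_defect(value):
    """Another pure phi-factored strategy: its action ignores phi(h) entirely."""
    return "D"


def play(factor, strategy_1, strategy_2, rounds=5):
    """Simulate the repeated game; each player conditions only on her factor value."""
    v1, v2 = factor.initial(), factor.initial()
    history = []
    for _ in range(rounds):
        a1, a2 = strategy_1(v1), strategy_2(v2)
        history.append((a1, a2))
        # Recursive update of the factor values, one step per stage.
        v1 = factor.update(v1, (a1, a2))
        v2 = factor.update(v2, (a2, a1))  # player 2 sees profiles from her own viewpoint
    return history


if __name__ == "__main__":
    factor = RecallFactor(k=1)
    print(play(factor, tit_for_tat, always_defect))
    # -> [('C', 'D'), ('D', 'D'), ('D', 'D'), ('D', 'D'), ('D', 'D')]
```

The sketch illustrates the setting of the paper's result: since the opponent's strategy is factored through a recursive factor with finite range, a best reply can also be sought among pure strategies that condition only on the same factor value.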
