Optimization and Identification in a Non-equilibrium Dynamic Game

This paper considers optimization and identification problems in a non-equilibrium dynamic game. Specifically, we study an infinitely repeated game between a human and a machine, based on the standard Prisoners' Dilemma model. The machine's strategy is assumed to be fixed with k-step memory and may be unknown to the human. By analyzing the state transfer graph, we show that the optimal strategy maximizing the human's averaged payoff is periodic after finitely many steps, which makes it feasible to find the optimal strategy in practice. Moreover, when k = 1, the human does not lose to the machine while optimizing his own averaged payoff; when k ≥ 2, however, he may lose if he focuses only on optimizing his own payoff. The identifiability problem is also investigated for the case where the machine's strategy is unknown to the human.
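
To illustrate the state transfer graph argument, the following is a minimal sketch for the case k = 1, not the paper's own algorithm. It assumes that the machine's 1-step memory consists of both players' previous actions, and it uses the common payoff values T = 5, R = 3, P = 1, S = 0; both are assumptions, since the abstract specifies neither. The names PAYOFF, build_graph, best_cycle and TIT_FOR_TAT are hypothetical. Because the graph is finite, the human's best long-run averaged payoff is attained on a cycle reachable from the initial state, so the sketch enumerates those cycles by brute force.

```python
from itertools import product

# Assumed Prisoners' Dilemma payoffs, keyed by (human action, machine action)
# and giving (human payoff, machine payoff). T=5, R=3, P=1, S=0 is an assumption.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def build_graph(machine):
    """Build the state transfer graph for a 1-step-memory machine.

    A state is (human's last action, machine's last action). `machine` maps
    each state to the machine's next action. Each edge is a tuple
    (next_state, human_payoff, machine_payoff, human_action).
    """
    edges = {}
    for state in product("CD", repeat=2):
        m = machine[state]  # machine's move is fixed by the current state
        edges[state] = [((h, m), PAYOFF[(h, m)][0], PAYOFF[(h, m)][1], h)
                        for h in "CD"]
    return edges


def best_cycle(edges, start):
    """Return (human mean payoff, machine mean payoff, human actions on cycle)
    for the simple cycle reachable from `start` with the largest human mean."""
    best = None

    def dfs(path_states, path_edges):
        nonlocal best
        for nxt, hp, mp, h in edges[path_states[-1]]:
            step = (hp, mp, h)
            if nxt in path_states:
                # Closing a cycle: keep only the edges from nxt onward.
                i = path_states.index(nxt)
                cyc = path_edges[i:] + [step]
                n = len(cyc)
                cand = (sum(e[0] for e in cyc) / n,
                        sum(e[1] for e in cyc) / n,
                        "".join(e[2] for e in cyc))
                if best is None or cand[0] > best[0]:
                    best = cand
            else:
                dfs(path_states + [nxt], path_edges + [step])

    dfs([start], [])
    return best


# Example machine (assumed): tit-for-tat, which repeats the human's last action.
TIT_FOR_TAT = {(h, m): h for h, m in product("CD", repeat=2)}
ALWAYS_DEFECT = {(h, m): "D" for h, m in product("CD", repeat=2)}

if __name__ == "__main__":
    print(best_cycle(build_graph(TIT_FOR_TAT), ("C", "C")))
    print(best_cycle(build_graph(ALWAYS_DEFECT), ("C", "C")))
```

Under these assumptions, the best cycle against tit-for-tat is permanent cooperation, where both players average 3, and the best cycle against an always-defecting machine is permanent defection, where both average 1. In both cases the human does not fall below the machine, which is consistent with the k = 1 claim. For k ≥ 2 the same construction would apply with states that record the last k action pairs, although exhaustive cycle enumeration then becomes costly and a maximum mean cycle algorithm would be a more reasonable choice.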
