Dynamic Opponent Modelling in Fictitious Play

Distributed optimization can be formulated as an n-player coordination game. One of the most common learning techniques in game theory is fictitious play and its variations; however, fictitious play rests on the implicit assumption that opponents' strategies are stationary. In this paper we present a new variation of fictitious play in which players predict their opponents' strategies using a particle filter algorithm, which allows a more realistic model of opponent strategy. We first used pre-specified opponent strategies to examine whether our algorithm can efficiently track those strategies, and used these experiments to assess how different values of the algorithm's parameters affect tracking performance. We then compared the proposed algorithm with stochastic and geometric fictitious play in three strategic-form games: a potential game and two climbing-hill games, one with two players and the other with three. We also tested our algorithm in two distributed optimization scenarios: a vehicle-target assignment game and a disaster management problem. Our algorithm converges to the optimum faster than both competitor algorithms in the strategic-form games and the vehicle-target assignment game; hence, by placing a greater computational demand on the individual agents, less communication is required between them. In the disaster management scenario we compared the results of particle filter fictitious play with those of Matlab's centralized algorithm bintprog and the centralized pre-planning algorithm of Gelenbe and Timotheou (2008, Random neural networks with synchronized interactions, Neural Comput., 20(9), 2308–2324). In this scenario our algorithm performed better than the pre-planning algorithm on two of the three performance measures we used.

[1] Makoto Yokoo et al. Adopt: asynchronous distributed constraint optimization with quality guarantees, 2005, Artif. Intell.

[2] C. Q. Lee et al. The Computer Journal, 1958, Nature.

[3] Alex Rogers et al. A multi-agent simulation system for prediction and scheduling of aero engine overhaul, 2008, AAMAS.

[4] Nicholas R. Jennings et al. Decentralized control of adaptive sampling in wireless sensor networks, 2009, TOSN.

[5] Jason R. Marden et al. Autonomous Vehicle-Target Assignment: A Game-Theoretical Formulation, 2007.

[6] Neil J. Gordon et al. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking, 2002, IEEE Trans. Signal Process.

[7] Josef Hofbauer et al. Learning in games with unstable equilibria, 2005, J. Econ. Theory.

[8] A. W. Tucker et al. Advances in game theory, 1964.

[9] J. Harsanyi. Games with randomly disturbed payoffs: A new rationale for mixed-strategy equilibrium points, 1973.

[10] John Nachbar. "Evolutionary" selection dynamics in games: Convergence and limit properties, 1990.

[11] Henk Hesselink et al. Scheduling Aircraft Using Constraint Satisfaction, 2002, WFLP.

[12] David M. Kreps et al. Learning Mixed Equilibria, 1993.

[13] Erol Gelenbe et al. Random Neural Networks with Synchronized Interactions, 2008, Neural Computation.

[14] K. Miyasawa. On the convergence of the learning process in a 2 x 2 non-zero-sum two-person game, 1961.

[15] William H. Sandholm et al. On the global convergence of stochastic fictitious play, 2002.

[16] J. Walrand et al. Distributed Dynamic Programming, 2022.

[17] J. Robinson. An iterative method of solving a game, 1951, Classics in Game Theory.

[18] L. Shapley et al. Potential Games, 1994.

[19] M. Yokoo et al. Distributed Breakout Algorithm for Solving Distributed Constraint Satisfaction Problems, 1996.

[20] Steven Reece et al. On Similarities between Inference in Game Theory and Machine Learning, 2008, J. Artif. Intell. Res.

[21] N. Gordon et al. Novel approach to nonlinear/non-Gaussian Bayesian state estimation, 1993.

[22] J. Nash. Equilibrium Points in N-Person Games, 1950, Proceedings of the National Academy of Sciences of the United States of America.

[23] V. R. Lesser et al. Asynchronous Partial Overlay: A New Algorithm for Solving Distributed Constraint Satisfaction Problems, 2011, J. Artif. Intell. Res.

[24] Archie C. Chapman et al. A Parameterisation of Algorithms for Distributed Constraint Optimisation via Potential Games, 2008.

[25] S. Hart et al. A simple adaptive procedure leading to correlated equilibrium, 2000.

[26] Thomas A. Runkler et al. Distributed supply chain management using ant colony optimization, 2009, Eur. J. Oper. Res.

[27] Hiroaki Kitano et al. RoboCup Rescue: search and rescue in large-scale disasters as a domain for autonomous agents research, 1999, IEEE International Conference on Systems, Man, and Cybernetics (SMC'99).

[28] D. Fudenberg et al. The Theory of Learning in Games, 1998.

[29] Craig Boutilier et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems, 1998, AAAI/IAAI.

[30] Boi Faltings et al. A Scalable Method for Multiagent Constraint Optimization, 2005, IJCAI.