Dynamic Opponent Modelling in Fictitious Play

Distributed optimization can be formulated as an n-player coordination game. One of the most common learning techniques in game theory is fictitious play and its variations; however, fictitious play rests on the implicit assumption that opponents' strategies are stationary. In this paper we present a new variation of fictitious play in which players predict their opponents' strategies using a particle filter algorithm, which allows a more realistic model of opponent strategy. We first used pre-specified opponent strategies to examine whether our algorithm can efficiently track those strategies, and used these experiments to assess how different values of the algorithm's parameters affect tracking performance. We then compared the proposed algorithm with stochastic and geometric fictitious play in three strategic-form games: a potential game and two climbing-hill games, one with two players and the other with three. We also tested our algorithm in two distributed optimization scenarios: a vehicle-target assignment game and a disaster management problem. Our algorithm converges to the optimum faster than both competitor algorithms in the strategic-form games and the vehicle-target assignment game; hence, by placing a greater computational demand on the individual agents, less communication is required between them. In the disaster management scenario we compared the results of particle filter fictitious play with those of Matlab's centralized algorithm bintprog and the centralized pre-planning algorithm of Gelenbe and Timotheou (2008, Random neural networks with synchronized interactions, Neural Comput., 20(9), 2308–2324). In this scenario our algorithm performed better than the pre-planning algorithm on two of the three performance measures we used.

[1] Makoto Yokoo et al. Adopt: asynchronous distributed constraint optimization with quality guarantees, 2005, Artif. Intell.

[2] C. Q. Lee et al. The Computer Journal, 1958, Nature.

[3] Alex Rogers et al. A multi-agent simulation system for prediction and scheduling of aero engine overhaul, 2008, AAMAS.

[4] Nicholas R. Jennings et al. Decentralized control of adaptive sampling in wireless sensor networks, 2009, TOSN.

[5] Jason R. Marden et al. Autonomous Vehicle-Target Assignment: A Game-Theoretical Formulation, 2007.

[6] Neil J. Gordon et al. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking, 2002, IEEE Trans. Signal Process.

[7] Josef Hofbauer et al. Learning in games with unstable equilibria, 2005, J. Econ. Theory.

[8] A. W. Tucker et al. Advances in game theory, 1964.

[9] J. Harsanyi. Games with randomly disturbed payoffs: A new rationale for mixed-strategy equilibrium points, 1973.

[10] John Nachbar. "Evolutionary" selection dynamics in games: Convergence and limit properties, 1990.

[11] Henk Hesselink et al. Scheduling Aircraft Using Constraint Satisfaction, 2002, WFLP.

[12] David M. Kreps et al. Learning Mixed Equilibria, 1993.

[13] Erol Gelenbe et al. Random Neural Networks with Synchronized Interactions, 2008, Neural Computation.

[14] K. Miyasawa. On the convergence of the learning process in a 2 x 2 non-zero-sum two-person game, 1961.

[15] William H. Sandholm et al. On the global convergence of stochastic fictitious play, 2002.

[16] J. Walrand et al. Distributed Dynamic Programming, 2022.

[17] J. Robinson. An iterative method of solving a game, 1951, Classics in Game Theory.

[18] L. Shapley et al. Potential Games, 1994.

[19] M. Yokoo et al. Distributed Breakout Algorithm for Solving Distributed Constraint Satisfaction Problems, 1996.

[20] Steven Reece et al. On Similarities between Inference in Game Theory and Machine Learning, 2008, J. Artif. Intell. Res.

[21] N. Gordon et al. Novel approach to nonlinear/non-Gaussian Bayesian state estimation, 1993.

[22] J. Nash. Equilibrium Points in N-Person Games, 1950, Proceedings of the National Academy of Sciences of the United States of America.

[23] V. R. Lesser et al. Asynchronous Partial Overlay: A New Algorithm for Solving Distributed Constraint Satisfaction Problems, 2011, J. Artif. Intell. Res.

[24] Archie C. Chapman et al. A Parameterisation of Algorithms for Distributed Constraint Optimisation via Potential Games, 2008.

[25] S. Hart et al. A simple adaptive procedure leading to correlated equilibrium, 2000.

[26] Thomas A. Runkler et al. Distributed supply chain management using ant colony optimization, 2009, Eur. J. Oper. Res.

[27] Hiroaki Kitano et al. RoboCup Rescue: search and rescue in large-scale disasters as a domain for autonomous agents research, 1999, IEEE International Conference on Systems, Man, and Cybernetics (SMC'99).

[28] D. Fudenberg et al. The Theory of Learning in Games, 1998.

[29] Craig Boutilier et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems, 1998, AAAI/IAAI.

[30] Boi Faltings et al. A Scalable Method for Multiagent Constraint Optimization, 2005, IJCAI.