Distributed Fictitious Play for Multiagent Systems in Uncertain Environments

A multiagent system operates in an uncertain environment. Each agent holds a different, time-varying probability distribution over the environment, called a belief, and as time progresses these beliefs converge to a common probability distribution, the asymptotic belief. A global utility function that depends on the realized state of the environment and on the actions of all agents determines the system's optimal behavior. We define the asymptotically optimal action profile as a Nash equilibrium of the potential game obtained by taking the expected utility with respect to the asymptotic belief. At any finite time, however, agents' beliefs about the state of the environment need not coincide, and agents may select conflicting actions. This paper proposes a variation of the fictitious play algorithm that provably converges to equilibrium actions if the state beliefs converge to a common distribution at a rate that is at least linear. In conventional fictitious play, agents predict others' future behavior by computing histograms of their past actions and best respond to their expected payoffs integrated with respect to these histograms. In the variations developed here, histograms are built from the actions of nearby nodes only, and best responses are further integrated with respect to the local beliefs about the state of the environment. We exemplify the use of the algorithm in coordination games.
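The best-response step described above can be illustrated with a minimal sketch. All names, the payoff function, and the belief-update schedule below are hypothetical choices for illustration, not the paper's construction: two agents play a state-dependent coordination game, each keeps a histogram of the other's past actions, and each best responds to its expected payoff integrated over both that histogram and its own (linearly converging) state belief.

```python
# Hypothetical two-agent, two-state, two-action example of belief-aware
# fictitious play. Payoffs, beliefs, and update rules are illustrative only.
ACTIONS = [0, 1]
STATES = [0, 1]

def utility(state, a_self, a_other):
    # Illustrative payoff: reward matching the state, plus a coordination bonus.
    return 2.0 * (a_self == state) + 1.0 * (a_self == a_other)

def best_response(belief, histogram):
    # Expected payoff of each action, integrated over the local state belief
    # and the empirical histogram of the other agent's past actions.
    total = sum(histogram) or 1
    def expected(a):
        return sum(belief[s] * (histogram[b] / total) * utility(s, a, b)
                   for s in STATES for b in ACTIONS)
    return max(ACTIONS, key=expected)

beliefs = [[0.5, 0.5], [0.4, 0.6]]   # each agent's belief over the two states
hists = [[0, 0], [0, 0]]             # hists[i] counts the other agent's actions
for t in range(50):
    acts = [best_response(beliefs[i], hists[i]) for i in range(2)]
    for i in range(2):
        hists[i][acts[1 - i]] += 1
        # Beliefs converge geometrically (hence at least linearly) to a
        # common point mass on state 1, as the convergence condition requires.
        beliefs[i] = [beliefs[i][0] / 2, 1 - beliefs[i][0] / 2]

print(acts)  # both agents eventually coordinate on the action matching state 1
```

Because the beliefs contract toward the same distribution at a geometric rate, the transient disagreement in early rounds washes out and both agents settle on the same equilibrium action of the limiting coordination game.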
