Learning equilibria in repeated congestion games

While the class of congestion games has been thoroughly studied in the multi-agent systems literature, settings with incomplete information have received relatively little attention. In this paper we consider a setting in which the cost functions of the resources in the congestion game are initially unknown. The agents gather information about these cost functions through repeated interaction and observation of the costs they incur. In this context we consider the following requirement: the agents' algorithms should themselves be in equilibrium regardless of the actual cost functions, and should lead to an efficient outcome. We prove that this requirement is achievable for a broad class of games: repeated symmetric congestion games. Our results apply even when agents are somewhat limited in their capacity to monitor the actions of their counterparts, or when they are unable to determine the exact cost they incur from every resource. On the other hand, we show that there exist asymmetric congestion games for which no such equilibrium can be found, not even an inefficient one. Finally, we consider equilibria that are resistant to deviations by more than one player, and show that these do not exist even in repeated resource selection games.
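To make the setting concrete, here is a minimal Python sketch of the learning problem the abstract describes. It is not the paper's construction and does not itself constitute a learning equilibrium: agents in a symmetric resource selection game repeatedly choose resources, observe only the costs they incur, build running-average estimates of the unknown load-dependent cost functions, and play epsilon-greedy best responses to those estimates. All cost functions, parameter values, and helper names are illustrative assumptions.

    import random
    from collections import defaultdict

    N_AGENTS, N_RESOURCES, ROUNDS = 6, 3, 300

    # Hidden, load-dependent cost functions (linear here for concreteness);
    # the agents never see these directly, only sampled costs.
    true_cost = [lambda load, a=a: a * load for a in (1.0, 2.0, 3.0)]

    # estimates[r][load]: running average of costs observed on resource r
    # when its load was `load`. Unseen loads are optimistically treated
    # as cost 0, which encourages exploration of untried configurations.
    estimates = [defaultdict(float) for _ in range(N_RESOURCES)]
    counts = [defaultdict(int) for _ in range(N_RESOURCES)]
    choices = [random.randrange(N_RESOURCES) for _ in range(N_AGENTS)]

    for t in range(ROUNDS):
        load = [choices.count(r) for r in range(N_RESOURCES)]
        # Agents observe only the costs incurred on resources actually used.
        for r in set(choices):
            c = true_cost[r](load[r])
            counts[r][load[r]] += 1
            estimates[r][load[r]] += (c - estimates[r][load[r]]) / counts[r][load[r]]
        # Simultaneous epsilon-greedy best responses to last round's loads.
        new_choices = []
        for i in range(N_AGENTS):
            if random.random() < 0.1:
                new_choices.append(random.randrange(N_RESOURCES))
            else:
                def est(r, i=i):
                    # Load on r if agent i switched to it (others fixed).
                    l = load[r] + (0 if choices[i] == r else 1)
                    return estimates[r].get(l, 0.0)
                new_choices.append(min(range(N_RESOURCES), key=est))
        choices = new_choices

    print("final loads:", [choices.count(r) for r in range(N_RESOURCES)])

With these linear costs the play tends toward the pure Nash loads (3, 2, 1), where no agent can lower its cost by switching. The paper's requirement is stronger than what this sketch achieves: it asks that the learning algorithms themselves form an equilibrium of the repeated game, so that no agent gains by unilaterally abandoning the prescribed learning procedure, whatever the true cost functions turn out to be.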
