Paper information - Learning efficient Nash equilibria in distributed systems

Learning efficient Nash equilibria in distributed systems

An individual's learning rule is completely uncoupled if it does not depend directly on the actions or payoffs of anyone else. We propose a variant of log-linear learning that is completely uncoupled and that selects an efficient (welfare-maximizing) pure Nash equilibrium in all generic n-person games that possess at least one pure Nash equilibrium. In games that do not have such an equilibrium, there is a simple formula that expresses the long-run probability of the various disequilibrium states in terms of two factors: (i) the sum of payoffs over all agents, and (ii) the maximum payoff gain that results from a unilateral deviation by some agent. This welfare/stability trade-off criterion provides a novel framework for analyzing the selection of disequilibrium as well as equilibrium states in n-person games.
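The two quantities named in the abstract can be computed directly for a small game. The following is a minimal sketch, not the paper's learning rule: it only evaluates (i) the sum of payoffs and (ii) the maximum gain from a unilateral deviation for each action profile, then picks the welfare-maximizing pure Nash equilibrium. The payoff table, `num_actions`, and all function names are hypothetical and chosen for illustration; the abstract does not state how the two factors combine into the long-run probability formula, so that step is left out.

```python
from itertools import product

# Hypothetical 2-player game (illustrative numbers, not from the paper).
# payoffs[profile] = (payoff to player 0, payoff to player 1)
payoffs = {
    (0, 0): (3, 3),
    (0, 1): (0, 1),
    (1, 0): (1, 0),
    (1, 1): (2, 2),
}
num_actions = (2, 2)  # number of actions available to each player


def all_profiles():
    """Enumerate every pure action profile of the game."""
    return list(product(*(range(n) for n in num_actions)))


def welfare(profile):
    """Factor (i): the sum of payoffs over all agents."""
    return sum(payoffs[profile])


def max_deviation_gain(profile):
    """Factor (ii): the largest payoff gain any single agent can obtain
    by changing only its own action (0 if no deviation is profitable)."""
    best = 0
    for i, n_i in enumerate(num_actions):
        for a in range(n_i):
            if a == profile[i]:
                continue
            deviated = profile[:i] + (a,) + profile[i + 1:]
            gain = payoffs[deviated][i] - payoffs[profile][i]
            best = max(best, gain)
    return best


def pure_nash_equilibria():
    """Profiles from which no agent has a strictly profitable deviation."""
    return [p for p in all_profiles() if max_deviation_gain(p) == 0]


def efficient_equilibrium():
    """The welfare-maximizing pure Nash equilibrium, or None if none exists."""
    equilibria = pure_nash_equilibria()
    return max(equilibria, key=welfare) if equilibria else None


for p in all_profiles():
    print(p, "welfare:", welfare(p), "max deviation gain:", max_deviation_gain(p))
print("efficient pure Nash equilibrium:", efficient_equilibrium())
```

In this toy game both `(0, 0)` and `(1, 1)` are pure Nash equilibria, and `(0, 0)` has the higher total payoff, so it is the profile the abstract's selection result would single out. In a game with no pure equilibrium, `efficient_equilibrium()` returns `None`, and the per-profile welfare and deviation-gain values are the inputs to the trade-off criterion the abstract describes.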

H. Peyton Young | Bary S. R. Pradelski
