A few good agents: multi-agent social learning

In this paper, we investigate multi-agent learning (MAL) in a multi-agent resource selection (MARS) problem in which a large group of agents competes for common resources. Since agents in such a setting are self-interested, MAL in MARS domains typically focuses on convergence to a set of non-cooperative equilibria. As the prisoner's dilemma illustrates, however, selfish equilibria are not necessarily optimal with respect to the natural objective function of the target problem, e.g., resource utilization in the case of MARS. On the other hand, centrally administered optimization of physically distributed agents is infeasible in many real-life applications, such as transportation traffic problems. To explore the possibility of a middle-ground solution, we analyze two types of costs for evaluating MAL algorithms in this context. The quality loss of a selfish algorithm can be measured quantitatively by the price of anarchy, i.e., the ratio of the objective function value of a selfish solution to that of an optimal solution. Analogously, we introduce the price of monarchy of a learning algorithm to quantify the practical cost of coordination in terms of communication cost. We then introduce a multi-agent social learning approach named A Few Good Agents (AFGA) that motivates self-interested agents to cooperate with one another to reduce the price of anarchy while bounding the price of monarchy. A preliminary set of experiments on the El Farol bar problem, a simple example of MARS, shows promising results.
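
To make the ratio in the price-of-anarchy definition concrete, the following is a minimal sketch on an El Farol-style setting. It is an illustration under stated assumptions, not the AFGA algorithm or the paper's experimental setup: the objective `utilization`, the helper names, the capacity, and the attendance numbers are all hypothetical and do not come from the paper.

```python
from typing import Sequence


def utilization(attendance: int, capacity: int) -> int:
    """Hypothetical objective (an assumption, not the paper's definition):
    every attendee enjoys the bar if it is not overcrowded, nobody does otherwise."""
    return attendance if attendance <= capacity else 0


def average_utilization(attendances: Sequence[int], capacity: int) -> float:
    """Mean of the hypothetical objective over a sequence of rounds."""
    if not attendances:
        raise ValueError("need at least one round")
    return sum(utilization(a, capacity) for a in attendances) / len(attendances)


def price_of_anarchy(selfish_value: float, optimal_value: float) -> float:
    """Ratio of the objective value of a selfish solution to that of an
    optimal solution, following the definition given in the abstract."""
    if optimal_value <= 0:
        raise ValueError("optimal objective value must be positive")
    return selfish_value / optimal_value


if __name__ == "__main__":
    capacity = 60                       # assumed bar capacity
    selfish_rounds = [55, 65, 58, 62]   # made-up attendance under selfish play
    optimal_rounds = [60, 60, 60, 60]   # attendance exactly at capacity each round

    selfish = average_utilization(selfish_rounds, capacity)
    optimal = average_utilization(optimal_rounds, capacity)
    print(price_of_anarchy(selfish, optimal))
```

In this toy convention the objective is maximized, so the ratio is at most 1 and values nearer 1 indicate less loss from selfish behavior. A cost-minimizing formulation would invert that reading.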
