Robust Learning Equilibrium

We introduce robust learning equilibrium. The idea of learning equilibrium is that the learning algorithms in a multi-agent system should themselves be in equilibrium, rather than merely lead to equilibrium. That is, a learning equilibrium is immune to strategic deviations: every agent is better off using its prescribed learning algorithm, provided all other agents follow their algorithms, regardless of the unknown state of the environment. However, a learning equilibrium need not be immune to non-strategic mistakes. For example, if the monitoring devices fail for some period of time (e.g., the correct input does not reach the agents), then following the algorithm after the devices are repaired may no longer be in equilibrium. A robust learning equilibrium is immune to such non-strategic mistakes as well. Establishing the existence of a (robust) learning equilibrium is especially challenging when the monitoring devices are 'weak', i.e., when the information available to each agent at each stage is limited. We initiate a study of robust learning equilibrium with a general monitoring structure and apply it to the context of auctions. We prove the existence of a robust learning equilibrium in repeated first-price auctions, and discuss its properties.
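To make the repeated first-price auction setting concrete, the following is a minimal toy simulation: each agent has a fixed private value and repeatedly submits a bid, the highest bidder wins (ties broken at random), and a losing agent nudges its bid up toward its value. The learning rule here is purely illustrative, and is not the equilibrium algorithm constructed in the paper; all function and parameter names are hypothetical.

```python
import random

def repeated_first_price_auction(values, rounds=200, step=0.05, seed=0):
    """Toy repeated first-price auction (illustrative only).

    Each agent i has a private value values[i] and a current bid.
    Each round, the highest bidder wins (ties broken uniformly at
    random); every losing agent raises its bid by `step`, as long as
    the new bid does not exceed its value. The winner keeps its bid.
    """
    rng = random.Random(seed)
    bids = [0.0 for _ in values]
    for _ in range(rounds):
        high = max(bids)
        winners = [i for i, b in enumerate(bids) if b == high]
        winner = rng.choice(winners)  # uniform tie-breaking
        for i, v in enumerate(values):
            # A losing agent moves its bid toward its private value.
            if i != winner and bids[i] + step <= v:
                bids[i] += step
    return winner, bids

winner, bids = repeated_first_price_auction([1.0, 0.8, 0.6])
```

Under this simple rule, bids escalate until each losing agent is capped near its value, after which the highest-value agent wins every round. The paper's concern is a different and stronger property: whether such a prescribed learning rule is itself an equilibrium, even under limited or temporarily faulty monitoring.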
