Paper Information - On No-Regret Learning, Fictitious Play, and Nash Equilibrium

This paper addresses the question: what is the outcome of multi-agent learning via no-regret algorithms in repeated games? Specifically, can the outcome of no-regret learning be characterized by traditional game-theoretic solution concepts, such as Nash equilibrium? The conclusion of this study is that no-regret learning is reminiscent of fictitious play: play converges to Nash equilibrium in dominance-solvable, constant-sum, and general-sum 2 × 2 games, but cycles exponentially in the Shapley game. Notably, however, the information required of fictitious play far exceeds that of no-regret learning.
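
To make the fictitious-play dynamic mentioned above concrete, here is a minimal sketch. It is not the paper's experimental setup. It assumes matching pennies as the constant-sum 2 × 2 game, and the function names, the initial counts of one per action, the first-index tie-breaking, and the number of rounds are all illustrative choices. Each player best-responds to the empirical frequencies of the opponent's past actions, which means it must observe those actions and know its own payoff matrix.

```python
def best_response(payoffs, opponent_counts):
    """Return the action with the highest expected payoff against the
    opponent's empirical action frequencies (ties go to the lowest index).

    payoffs[i][j] is this player's payoff for its own action i
    when the opponent plays action j.
    """
    total = sum(opponent_counts)
    expected = [
        sum(p * c / total for p, c in zip(row, opponent_counts))
        for row in payoffs
    ]
    return max(range(len(expected)), key=expected.__getitem__)


def fictitious_play(row_payoffs, col_payoffs, rounds=20000):
    """Simulate two-player fictitious play and return each player's
    empirical action frequencies (initial fictitious counts included)."""
    # Assumption: start with one fictitious observation of every action.
    row_counts = [1] * len(row_payoffs)
    col_counts = [1] * len(col_payoffs)
    for _ in range(rounds):
        # Both players move simultaneously, using only past history.
        a_row = best_response(row_payoffs, col_counts)
        a_col = best_response(col_payoffs, row_counts)
        row_counts[a_row] += 1
        col_counts[a_col] += 1
    row_total, col_total = sum(row_counts), sum(col_counts)
    return (
        [c / row_total for c in row_counts],
        [c / col_total for c in col_counts],
    )


# Matching pennies: the row player wins on a match, the column player otherwise.
ROW_PAYOFFS = [[1, -1], [-1, 1]]   # indexed [row action][column action]
COL_PAYOFFS = [[-1, 1], [1, -1]]   # indexed [column action][row action]

if __name__ == "__main__":
    row_freq, col_freq = fictitious_play(ROW_PAYOFFS, COL_PAYOFFS)
    print("row player frequencies:", row_freq)
    print("column player frequencies:", col_freq)
```

In this game the actions themselves keep cycling, but the empirical frequencies of both players should drift toward the mixed Nash equilibrium (1/2, 1/2), which is the sense of convergence usually meant for fictitious play in constant-sum games.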
