Paper information - No-regret learning in convex games

No-regret learning in convex games

Quite a bit is known about minimizing different kinds of regret in experts problems, and how these regret types relate to types of equilibria in the multiagent setting of repeated matrix games. Much less is known about the possible kinds of regret in online convex programming problems (OCPs), or about equilibria in the analogous multiagent setting of repeated convex games. This gap is unfortunate, since convex games are much more expressive than matrix games, and since many important machine learning problems can be expressed as OCPs. In this paper, we work to close this gap: we analyze a spectrum of regret types which lie between external and swap regret, along with their corresponding equilibria, which lie between coarse correlated and correlated equilibrium. We also analyze algorithms for minimizing these regret types. As examples of our framework, we derive algorithms for learning correlated equilibria in polyhedral convex games and extensive-form correlated equilibria in extensive-form games. The former is exponentially more efficient than previous algorithms, and the latter is the first of its type.
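For readers unfamiliar with the terms, the simplest regret type named in the abstract, external regret in an online convex program, can be illustrated with a minimal sketch. It runs projected online gradient descent on a one-dimensional problem. The quadratic losses, the interval `[-1, 1]`, the `1/sqrt(t)` step size, and the function names are all assumptions made for illustration. This is not one of the paper's algorithms, and it does not cover the stronger regret types between external and swap regret.

```python
import math


def project(x, lo=-1.0, hi=1.0):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, x))


def ogd_external_regret(targets, lo=-1.0, hi=1.0):
    """Projected online gradient descent on the (assumed) convex losses
    l_t(x) = (x - targets[t])**2 over the feasible interval [lo, hi].

    Returns (plays, regret), where regret is the external regret: the
    learner's cumulative loss minus the cumulative loss of the best
    single fixed point in hindsight.
    """
    x = project(0.0, lo, hi)  # arbitrary feasible starting point
    plays = []
    learner_loss = 0.0
    for t, y in enumerate(targets, start=1):
        plays.append(x)                # commit to x before seeing the loss
        learner_loss += (x - y) ** 2   # suffer the round-t loss
        grad = 2.0 * (x - y)           # gradient of the round-t loss at x
        eta = 1.0 / math.sqrt(t)       # illustrative decreasing step size
        x = project(x - eta * grad, lo, hi)

    # For a sum of 1-D quadratics the unconstrained minimizer is the mean;
    # because the total loss is convex in one dimension, clipping the mean
    # to the interval gives the constrained minimizer.
    best = project(sum(targets) / len(targets), lo, hi)
    best_loss = sum((best - y) ** 2 for y in targets)
    return plays, learner_loss - best_loss


if __name__ == "__main__":
    T = 1000
    targets = [1.0 if t % 2 == 0 else -1.0 for t in range(T)]
    _, regret = ogd_external_regret(targets)
    print("average external regret per round:", regret / T)
```

With these choices the standard online gradient descent analysis bounds the external regret by a constant times `sqrt(T)`, so the average regret per round shrinks as `T` grows. That shrinking average is the "no-regret" property in its most basic form.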

Geoffrey J. Gordon | Amy Greenwald | Casey Marks

[1] Geoffrey J. Gordon, et al. Approximate solutions to Markov decision processes, 1999.

[2] Geoffrey J. Gordon. Regret bounds for prediction problems, 1999, COLT '99.

[3] Dean P. Foster, et al. Regret in the On-Line Decision Problem, 1999.

[4] B. Stengel, et al. Computationally efficient coordination in game trees, 2002.

[5] Santosh S. Vempala, et al. Efficient algorithms for online decision problems, 2005, J. Comput. Syst. Sci.

[6] Martin Zinkevich, et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent, 2003, ICML.

[7] Corinna Cortes, et al. Support-Vector Networks, 1995, Machine Learning.

[8] Yoram Singer, et al. Convex Repeated Games and Fenchel Duality, 2006, NIPS.

[9] Geoffrey J. Gordon. No-regret Algorithms for Online Convex Programs, 2006, NIPS.

[10] Elad Hazan, et al. Computational Equivalence of Fixed Points and No Regret Algorithms, and Convergence to Equilibria, 2007, NIPS.

[11] Yishay Mansour, et al. From External to Internal Regret, 2005, J. Mach. Learn. Res.

[12] Gábor Lugosi, et al. Learning correlated equilibria in games with compact sets of strategies, 2007, Games Econ. Behav.

[13] Casey Marks. No-Regret Learning and Game-Theoretic Equilibria, 2008.