
Smooth markets: A basic mechanism for organizing gradient-based learners

With the success of modern machine learning, it is becoming increasingly important to understand and control how learning algorithms interact. Unfortunately, negative results from game theory show there is little hope of understanding or controlling general n-player games. We therefore introduce smooth markets (SM-games), a class of n-player games with pairwise zero-sum interactions. SM-games codify a common design pattern in machine learning that includes some GANs, adversarial training, and other recent algorithms. We show that SM-games are amenable to analysis and optimization using first-order methods.
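As a rough illustration of what "pairwise zero-sum interactions" optimized with a first-order method might look like, the sketch below builds a toy 3-player game. Everything in it is an assumption made for illustration and is not taken from the paper: the quadratic self-loss, the bilinear interaction terms, the matrix `A`, the step size, and the function names. Each player runs plain gradient descent on its own loss, simultaneously with the others.

```python
# Hypothetical 3-player game with scalar parameters (illustrative only).
# Player i minimizes f_i(w) = 0.5 * w_i**2 + sum_{j != i} A[i][j] * w_i * w_j.
# A is antisymmetric (A[i][j] == -A[j][i]), so for every pair of players
# the two interaction terms cancel: g_ij + g_ji = 0.
A = [
    [0.0, 1.0, -2.0],
    [-1.0, 0.0, 1.0],
    [2.0, -1.0, 0.0],
]


def interaction(i, j, w):
    """Pairwise term g_ij(w_i, w_j) that appears in player i's loss."""
    return A[i][j] * w[i] * w[j]


def own_gradient(i, w):
    """Derivative of player i's loss with respect to its own parameter w_i."""
    return w[i] + sum(A[i][j] * w[j] for j in range(len(w)) if j != i)


def simultaneous_gradient_descent(w, lr=0.05, steps=500):
    """All players take a gradient step on their own loss at the same time."""
    for _ in range(steps):
        grads = [own_gradient(i, w) for i in range(len(w))]
        w = [wi - lr * gi for wi, gi in zip(w, grads)]
    return w


w0 = [1.0, -2.0, 0.5]
w_final = simultaneous_gradient_descent(w0)
```

In this toy instance the iterates shrink toward the origin, the point where every player's own gradient vanishes. That behaviour depends on the particular losses and step size chosen here and should not be read as a statement of the paper's results.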

Joel Z. Leibo | Thore Graepel | Georgios Piliouras | David Balduzzi | Ian M. Gemp | Edward Hughes | Wojciech M. Czarnecki | Thomas W. Anthony
