Open-ended Learning in Symmetric Zero-sum Games
Max Jaderberg | Wojciech M. Czarnecki | Thore Graepel | Yoram Bachrach | David Balduzzi | Julien Pérolat | Marta Garnelo
[1] Lawrence Freedman. The Problem of Strategy, 1980.
[2] W. Daniel Hillis, et al. Co-evolving parasites improve simulated evolution as an optimization procedure, 1990.
[3] Peter J. Fleming, et al. Genetic Algorithms for Multiobjective Optimization: Formulation, Discussion and Generalization, 1993, ICGA.
[4] C. Fonseca, et al. Genetic Algorithms for Multi-Objective Optimization: Formulation, Discussion, and Generalization, 1993.
[5] Gerald Tesauro, et al. Temporal Difference Learning and TD-Gammon, 1995, J. Int. Comput. Games Assoc..
[6] Richard K. Belew, et al. New Methods for Competitive Coevolution, 1997, Evolutionary Computation.
[7] Kaisa Miettinen, et al. Nonlinear Multiobjective Optimization, 1998, International Series in Operations Research and Management Science.
[8] Stefano Nolfi, et al. Co-evolving predator and prey robots, 1998, Artificial Life.
[9] Jordan B. Pollack, et al. A Game-Theoretic Approach to the Simple Coevolutionary Algorithm, 2000, PPSN.
[10] Jordan B. Pollack, et al. Pareto Optimality in Coevolutionary Learning, 2001, ECAL.
[11] Avrim Blum, et al. Planning in the Presence of Cost Functions Controlled by an Adversary, 2003, ICML.
[12] David S. Leslie, et al. Generalised weakened fictitious play, 2006, Games Econ. Behav..
[13] B. Roberson. The Colonel Blotto game, 2006.
[14] Michael P. Wellman. Methods for Empirical Game-Theoretic Analysis, 2006, AAAI.
[15] Risto Miikkulainen, et al. Coevolution of neural networks using a layered Pareto archive, 2006, GECCO.
[16] Michael H. Bowling, et al. A New Algorithm for Generating Equilibria in Massive Zero-Sum Games, 2007, AAAI.
[17] Edwin D. de Jong, et al. A Monotonic Archive for Pareto-Coevolution, 2007, Evolutionary Computation.
[18] Sergiu Hart, et al. Discrete Colonel Blotto and General Lotto games, 2008, Int. J. Game Theory.
[19] Peter Bro Miltersen, et al. On Range of Skill, 2008, AAAI.
[20] Yoav Shoham, et al. Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations, 2009.
[21] Hod Lipson, et al. Coevolution of Fitness Predictors, 2008, IEEE Transactions on Evolutionary Computation.
[22] Kenneth O. Stanley, et al. Exploiting Open-Endedness to Solve Problems Through the Search for Novelty, 2008, ALIFE.
[23] Yuan Yao, et al. Statistical ranking and combinatorial Hodge theory, 2008, Math. Program..
[24] Asuman E. Ozdaglar, et al. Flows and Decompositions of Games: Harmonic and Potential Games, 2010, Math. Oper. Res..
[25] Edwin D. de Jong, et al. Coevolutionary Principles, 2012, Handbook of Natural Computing.
[26] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[27] David Silver, et al. Fictitious Self-Play in Extensive-Form Games, 2015, ICML.
[28] Kenneth O. Stanley, et al. Open-Ended Evolution: Perspectives from the OEE Workshop in York, 2016, Artificial Life.
[29] Susan Stepney, et al. Defining and simulating open-ended novelty: requirements, guidelines, and challenges, 2016, Theory in Biosciences.
[30] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[31] M. Baker. Hodge theory in combinatorics, 2017, arXiv:1705.07960.
[32] Kenneth O. Stanley, et al. Minimal criterion coevolution: a new approach to open-ended search, 2017, GECCO.
[33] David Silver, et al. A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning, 2017, NIPS.
[34] Thore Graepel, et al. The Mechanics of n-Player Differentiable Games, 2018, ICML.
[35] Pieter Abbeel, et al. Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments, 2017, ICLR.
[36] Joel Z. Leibo, et al. Human-level performance in first-person multiplayer games with population-based deep reinforcement learning, 2018, arXiv.
[37] Yuval Tassa, et al. Maximum a Posteriori Policy Optimisation, 2018, ICLR.
[38] Jakub W. Pachocki, et al. Emergent Complexity via Multi-Agent Competition, 2017, ICLR.
[39] Demis Hassabis, et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play, 2018, Science.
[40] Rui Wang, et al. Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions, 2019, arXiv.