Open-Ended Learning Team | Adam Stooke | Anuj Mahajan | Catarina Barros | Charlie Deck | Jakob Bauer | Jakub Sygnowski | Maja Trebacz | Max Jaderberg | Michael Mathieu | Nat McAleese | Nathalie Bradley-Schmieg | Nathaniel Wong | Nicolas Porcel | Roberta Raileanu | Steph Hughes-Fitt | Valentin Dalibard | Wojciech Marian Czarnecki
[1] Bernhard Schölkopf,et al. The Kernel Trick for Distances , 2000, NIPS.
[2] Pierre-Yves Oudeyer,et al. Language-Conditioned Goal Generation: a New Approach to Language Grounding for RL , 2020, ArXiv.
[3] Julian Togelius,et al. Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation , 2018, ArXiv.
[4] Andrew Zisserman,et al. Kickstarting Deep Reinforcement Learning , 2018, ArXiv.
[5] Razvan Pascanu,et al. Distilling Policy Distillation , 2019, AISTATS.
[6] Max Jaderberg,et al. Open-ended Learning in Symmetric Zero-sum Games , 2019, ICML.
[7] Teuvo Kohonen,et al. Self-organized formation of topologically correct feature maps , 1982, Biological Cybernetics.
[8] Bernard W. Silverman,et al. Density Estimation for Statistics and Data Analysis , 1987 .
[9] Wojciech Zaremba,et al. Asymmetric self-play for automatic goal discovery in robotic manipulation , 2021, ArXiv.
[10] Quoc V. Le,et al. A graph placement methodology for fast chip design , 2021, Nature.
[11] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[12] Guy Lever,et al. Human-level performance in 3D multiplayer games with population-based reinforcement learning , 2018, Science.
[13] Jeff Clune,et al. Multi-task curriculum learning in a complex, visual, hard-exploration domain: Minecraft , 2021, ArXiv.
[14] D. Krass,et al. Percentile performance criteria for limiting average Markov decision processes , 1995, IEEE Trans. Autom. Control..
[15] Joel Z. Leibo,et al. Multi-agent Reinforcement Learning in Sequential Social Dilemmas , 2017, AAMAS.
[16] Sergey Levine,et al. Skew-Fit: State-Covering Self-Supervised Reinforcement Learning , 2019, ICML.
[17] Jacob Schrum,et al. CPPN2GAN: combining compositional pattern producing networks and GANs for large-scale pattern generation , 2020, GECCO.
[18] Vivek S. Borkar,et al. Risk-constrained Markov decision processes , 2010, 49th IEEE Conference on Decision and Control (CDC).
[19] Andrew K. Lampinen,et al. Automated curriculum generation through setter-solver interactions , 2020, ICLR.
[20] Yee Whye Teh,et al. Distral: Robust multitask reinforcement learning , 2017, NIPS.
[21] Thore Graepel,et al. Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers , 2021, ICML.
[22] Joel Z. Leibo,et al. OPtions as REsponses: Grounding behavioural hierarchies in multi-agent reinforcement learning , 2020, ICML.
[23] Geraud Nangue Tasse,et al. A Boolean Task Algebra for Reinforcement Learning , 2020, NeurIPS.
[24] David Silver,et al. Learning values across many orders of magnitude , 2016, NIPS.
[25] Jakub W. Pachocki,et al. Dota 2 with Large Scale Deep Reinforcement Learning , 2019, ArXiv.
[26] David Silver,et al. Deep Reinforcement Learning from Self-Play in Imperfect-Information Games , 2016, ArXiv.
[27] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[28] Huchuan Lu,et al. Deep Mutual Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[29] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[30] Thore Graepel,et al. Re-evaluating evaluation , 2018, NeurIPS.
[31] Joel Lehman,et al. Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions , 2020, ICML.
[32] Benjamin Rosman,et al. Composing Value Functions in Reinforcement Learning , 2019, ICML.
[33] Marcin Andrychowicz,et al. Solving Rubik's Cube with a Robot Hand , 2019, ArXiv.
[34] Razvan Pascanu,et al. Relational inductive biases, deep learning, and graph networks , 2018, ArXiv.
[35] Alex Graves,et al. Automated Curriculum Learning for Neural Networks , 2017, ICML.
[36] Frank Nielsen. Closed-form information-theoretic divergences for statistical mixtures , 2012, Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012).
[37] Julian Togelius,et al. PCGRL: Procedural Content Generation via Reinforcement Learning , 2020, AAAI.
[38] Pierre-Yves Oudeyer,et al. In Search of the Neural Circuits of Intrinsic Motivation , 2007, Front. Neurosci..
[39] Shie Mannor,et al. Percentile Optimization for Markov Decision Processes with Parameter Uncertainty , 2010, Oper. Res..
[40] Sergey Levine,et al. Visual Reinforcement Learning with Imagined Goals , 2018, NeurIPS.
[41] Yee Whye Teh,et al. Progress & Compress: A scalable framework for continual learning , 2018, ICML.
[42] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008 .
[43] Igor Mordatch,et al. Emergent Tool Use From Multi-Agent Autocurricula , 2019, ICLR.
[44] Joel Z. Leibo,et al. Autocurricula and the Emergence of Innovation from Social Interaction: A Manifesto for Multi-Agent Intelligence Research , 2019, ArXiv.
[45] Mohammad Ghavamzadeh,et al. Actor-Critic Algorithms for Risk-Sensitive MDPs , 2013, NIPS.
[46] Julian Togelius,et al. An experiment in automatic game design , 2008, 2008 IEEE Symposium On Computational Intelligence and Games.
[47] Sergey Levine,et al. Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design , 2020, NeurIPS.
[48] C. Koch,et al. Invariant visual representation by single neurons in the human brain , 2005, Nature.
[49] Julian Togelius,et al. Procedural Content Generation in Games , 2016, Computational Synthesis and Creative Systems.
[50] Jürgen Schmidhuber,et al. PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem , 2011, Front. Psychol..
[51] Brian A. Davey,et al. An Introduction to Lattices and Order , 1989 .
[52] Joshua B. Tenenbaum,et al. Learning with AMIGo: Adversarially Motivated Intrinsic Goals , 2020, ICLR.
[53] Pierre-Yves Oudeyer,et al. Teacher algorithms for curriculum learning of Deep RL in continuously parameterized environments , 2019, CoRL.
[54] Bernhard Schölkopf,et al. Nonlinear Component Analysis as a Kernel Eigenvalue Problem , 1998, Neural Computation.
[55] Kenneth O. Stanley,et al. Minimal criterion coevolution: a new approach to open-ended search , 2017, GECCO.
[56] Shimon Whiteson,et al. MAVEN: Multi-Agent Variational Exploration , 2019, NeurIPS.
[57] Tom Schaul,et al. Universal Value Function Approximators , 2015, ICML.
[58] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[59] J. Hanley,et al. The meaning and use of the area under a receiver operating characteristic (ROC) curve. , 1982, Radiology.
[60] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[61] Jeff Clune,et al. AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence , 2019, ArXiv.
[62] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[63] Avrim Blum,et al. Planning in the Presence of Cost Functions Controlled by an Adversary , 2003, ICML.
[64] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[65] Max Jaderberg,et al. Population Based Training of Neural Networks , 2017, ArXiv.
[66] Pieter Abbeel,et al. Automatic Goal Generation for Reinforcement Learning Agents , 2017, ICML.
[67] Edward Grefenstette,et al. Prioritized Level Replay , 2020, ICML.
[68] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[69] Yoshua Bengio,et al. Understanding intermediate layers using linear classifier probes , 2016, ICLR.
[70] Kenneth O. Stanley,et al. Compositional Pattern Producing Networks : A Novel Abstraction of Development , 2007 .
[71] Michael H. Bowling,et al. Convergence Problems of General-Sum Multiagent Reinforcement Learning , 2000, ICML.
[72] Kenneth O. Stanley,et al. POET: open-ended coevolution of environments and their optimized solutions , 2019, GECCO.
[73] Rasmus Berg Palm,et al. EvoCraft: A New Challenge for Open-Endedness , 2020, EvoApplications.
[74] Wojciech Czarnecki,et al. Multi-task Deep Reinforcement Learning with PopArt , 2018, AAAI.
[75] Lei Han,et al. Curriculum-guided Hindsight Experience Replay , 2019, NeurIPS.
[76] Emanuel Todorov,et al. Compositionality of optimal control laws , 2009, NIPS.
[77] Max Jaderberg,et al. Real World Games Look Like Spinning Tops , 2020, NeurIPS.
[78] Pierre-Yves Oudeyer,et al. CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning , 2018, ICML.
[79] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[80] Zachary Chase Lipton,et al. Born Again Neural Networks , 2018, ICML.
[81] Shie Mannor,et al. Policy Gradients with Variance Related Risk Criteria , 2012, ICML.
[82] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[83] Yoav Shoham,et al. If multi-agent learning is the answer, what is the question? , 2007, Artif. Intell..
[84] Ilya Kostrikov,et al. Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play , 2017, ICLR.
[85] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[86] Maryam Kamgarpour,et al. Contextual Games: Multi-Agent Learning with Side Information , 2021, NeurIPS.
[87] John Schulman,et al. Teacher–Student Curriculum Learning , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[88] S. P. Lloyd,et al. Least squares quantization in PCM , 1982, IEEE Trans. Inf. Theory.
[89] H. Francis Song,et al. V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control , 2019, ICLR.
[90] David Silver,et al. A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning , 2017, NIPS.
[91] Allan Jabri,et al. Unsupervised Curricula for Visual Meta-Reinforcement Learning , 2019, NeurIPS.
[92] Sergei Vassilvitskii,et al. k-means++: the advantages of careful seeding , 2007, SODA '07.
[93] Risto Miikkulainen,et al. Evolving Neural Networks through Augmenting Topologies , 2002, Evolutionary Computation.
[94] M. Randic,et al. Resistance distance , 1993 .
[95] Alec Radford,et al. Multimodal Neurons in Artificial Neural Networks , 2021 .
[96] Pierre-Yves Oudeyer,et al. Automatic Curriculum Learning For Deep RL: A Short Survey , 2020, IJCAI.
[97] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.
[98] Jürgen Schmidhuber,et al. Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010) , 2010, IEEE Transactions on Autonomous Mental Development.
[99] D. Robinson,et al. The topology of the 2x2 games: a new periodic table , 2005 .
[100] J. Nash. Equilibrium Points in N-Person Games. , 1950, Proceedings of the National Academy of Sciences of the United States of America.
[101] Y. Takane,et al. Generalized Inverse Matrices , 2011 .
[103] Junhyuk Oh,et al. Pick Your Battles: Interaction Graphs as Population-Level Objectives for Strategic Diversity , 2021, AAMAS.