Alchemy: A benchmark and analysis toolkit for meta-reinforcement learning agents
Jane X. Wang | Michael King | Nicolas Porcel | Zeb Kurth-Nelson | Tina Zhu | Charlie Deck | Peter Choy | Mary Cassin | Malcolm Reynolds | Francis Song | Gavin Buttimore | David P. Reichert | Neil C. Rabinowitz | Loic Matthey | Demis Hassabis | Alexander Lerchner | Matthew Botvinick | Michael King
[1] Eduardo F. Morales,et al. An Introduction to Reinforcement Learning , 2011 .
[2] W. Geisler. Ideal Observer Analysis , 2002 .
[3] Max Jaderberg,et al. Population Based Training of Neural Networks , 2017, ArXiv.
[4] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[5] Demis Hassabis,et al. Mastering Atari, Go, chess and shogi by planning with a learned model , 2019, Nature.
[7] Jane X. Wang,et al. Meta-learning in natural and artificial intelligence , 2020, Current Opinion in Behavioral Sciences.
[8] Guy Lever,et al. Human-level performance in 3D multiplayer games with population-based reinforcement learning , 2018, Science.
[9] Joshua B. Tenenbaum,et al. Building machines that learn and think like people , 2016, Behavioral and Brain Sciences.
[10] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[11] Marek Wydmuch,et al. ViZDoom Competitions: Playing Doom From Pixels , 2018, IEEE Transactions on Games.
[12] S. Levine,et al. Guided Meta-Policy Search , 2019, NeurIPS.
[13] J. Gittins. Bandit processes and dynamic allocation indices , 1979 .
[14] Andrew Zisserman,et al. Kickstarting Deep Reinforcement Learning , 2018, ArXiv.
[15] Sergey Levine,et al. Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning , 2019, CoRL.
[16] Shane Legg,et al. Meta-trained agents implement Bayes-optimal agents , 2020, NeurIPS.
[17] Taehoon Kim,et al. Quantifying Generalization in Reinforcement Learning , 2018, ICML.
[18] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[19] Yee Whye Teh,et al. Meta reinforcement learning as task inference , 2019, ArXiv.
[20] H. Francis Song,et al. V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control , 2019, ICLR.
[21] Yuval Tassa,et al. dm_control: Software and Tasks for Continuous Control , 2020, Softw. Impacts.
[22] Razvan Pascanu,et al. Distilling Policy Distillation , 2019, AISTATS.
[23] Roozbeh Mottaghi,et al. ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Chelsea Finn,et al. Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices , 2020, ICML.
[25] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[26] C A Nelson,et al. Learning to Learn , 2017, Encyclopedia of Machine Learning and Data Mining.
[27] Song-Chun Zhu,et al. HALMA: Humanlike Abstraction Learning Meets Affordance in Rapid Problem Solving , 2021, ArXiv.
[28] Sergey Levine,et al. Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables , 2019, ICML.
[29] Ruslan Salakhutdinov,et al. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning , 2015, ICLR.
[30] Charles Kemp,et al. How to Grow a Mind: Statistics, Structure, and Abstraction , 2011, Science.
[31] Marwan Mattar,et al. Unity: A General Platform for Intelligent Agents , 2018, ArXiv.
[32] Yee Whye Teh,et al. Meta-learning of Sequential Strategies , 2019, ArXiv.
[33] Sergey Levine,et al. Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design , 2020, NeurIPS.
[34] Edward Grefenstette,et al. The NetHack Learning Environment , 2020, NeurIPS.
[35] Peter L. Bartlett,et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning , 2016, ArXiv.
[36] Michael O. Duff,et al. Design for an Optimal Probe , 2003, ICML.
[37] Zeb Kurth-Nelson,et al. Causal Reasoning from Meta-reinforcement Learning , 2019, ArXiv.
[38] Samuel J. Gershman,et al. Human-Level Reinforcement Learning through Theory-Based Modeling, Exploration, and Planning , 2021, ArXiv.
[39] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[40] Daan Wierstra,et al. Meta-Learning with Memory-Augmented Neural Networks , 2016, ICML.
[41] David M. Sobel,et al. A theory of causal learning in children: causal maps and Bayes nets. , 2004, Psychological review.
[42] Jane X. Wang,et al. Reinforcement Learning, Fast and Slow , 2019, Trends in Cognitive Sciences.
[43] T. Robbins,et al. Decision Making, Affect, and Learning: Attention and Performance XXIII , 2011 .
[44] Zeb Kurth-Nelson,et al. Learning to reinforcement learn , 2016, CogSci.
[45] David Silver,et al. Meta-Gradient Reinforcement Learning , 2018, NeurIPS.
[46] J. Vanschoren. Meta-Learning , 2018, Automated Machine Learning.
[47] Pieter Abbeel,et al. Some Considerations on Learning to Explore via Meta-Reinforcement Learning , 2018, ICLR 2018.
[48] Shimon Whiteson,et al. VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning , 2020, ICLR.
[49] Simon Farrell,et al. Computational Modeling of Cognition and Behavior , 2018 .
[50] Jonathan Baxter,et al. Theoretical Models of Learning to Learn , 1998, Learning to Learn.
[51] Katja Hofmann,et al. The Malmo Platform for Artificial Intelligence Experimentation , 2016, IJCAI.
[52] John Schulman,et al. Gotta Learn Fast: A New Benchmark for Generalization in RL , 2018, ArXiv.
[53] Razvan Pascanu,et al. Progressive Neural Networks , 2016, ArXiv.
[54] Razvan Pascanu,et al. Stabilizing Transformers for Reinforcement Learning , 2019, ICML.
[55] Pieter Abbeel,et al. A Simple Neural Attentive Meta-Learner , 2017, ICLR.
[56] J. Schulman,et al. Leveraging Procedural Generation to Benchmark Reinforcement Learning , 2019, ICML.
[57] W. J. Studden,et al. Theory Of Optimal Experiments , 1972 .
[58] Pieter Abbeel,et al. The Importance of Sampling in Meta-Reinforcement Learning , 2018, NeurIPS.
[59] Yoshua Bengio,et al. Learning a synaptic learning rule , 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.
[60] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract) , 2012, IJCAI.
[61] Thomas L. Griffiths,et al. Recasting Gradient-Based Meta-Learning as Hierarchical Bayes , 2018, ICLR.
[62] Julian Togelius,et al. Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning , 2019, IJCAI.
[63] Stephen Clark,et al. Grounded Language Learning Fast and Slow , 2021, ICLR.
[64] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[65] Julian Togelius,et al. Procedural Content Generation: From Automatically Generating Game Levels to Increasing Generality in Machine Learning , 2019, ArXiv.
[66] Tor Lattimore,et al. Behaviour Suite for Reinforcement Learning , 2019, ICLR.
[67] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[68] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[70] Peter Stone,et al. Reinforcement learning , 2019, Scholarpedia.
[71] Timothy E. J. Behrens,et al. Learning the value of information in an uncertain world , 2007, Nature Neuroscience.
[72] Jeffrey C Erlich,et al. Decision-making behaviors: weighing ethology, complexity, and sensorimotor compatibility , 2018, Current Opinion in Neurobiology.
[73] Sergey Levine,et al. Meta-Reinforcement Learning of Structured Exploration Strategies , 2018, NeurIPS.