Sergio Gómez Colmenarejo | Matthew W. Hoffman | Nando de Freitas | Fan Yang | Albin Cassirer | Abbas Abdolmaleki | Andrew Cowie | Ziyu Wang | John Aslanides | Bilal Piot | Feryal M. P. Behbahani | Gabriel Barth-Maron | Caglar Gulcehre | Tom Le Paine | Serkan Cabi | Bobak Shahriari | Kate Baumli | Sarah Henderson | Tamara Norman | Alexander Novikov
[1] Dean Pomerleau,et al. ALVINN, an autonomous land vehicle in a neural network , 2015 .
[2] Donald Michie,et al. Knowledge, Learning and Machine Intelligence , 1993 .
[3] Leslie Pack Kaelbling,et al. Acting Optimally in Partially Observable Stochastic Domains , 1994, AAAI.
[4] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[5] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[6] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res.
[7] Long-Ji Lin,et al. Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.
[8] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[9] Shane Legg,et al. Universal Intelligence: A Definition of Machine Intelligence , 2007, Minds and Machines.
[10] Martin A. Riedmiller,et al. Batch Reinforcement Learning , 2012, Reinforcement Learning.
[11] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[12] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[13] Stuart J. Russell,et al. Rationality and Intelligence: A Brief Update , 2013, PT-AI.
[14] Joelle Pineau,et al. Learning from Limited Demonstrations , 2013, NIPS.
[15] Matthieu Geist,et al. Boosted Bellman Residual Minimization Handling Expert Demonstrations , 2014, ECML/PKDD.
[16] Guy Lever,et al. Deterministic Policy Gradient Algorithms , 2014, ICML.
[17] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[18] Shane Legg,et al. Massively Parallel Methods for Deep Reinforcement Learning , 2015, ArXiv.
[19] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[20] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res.
[21] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[22] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[23] Tom Schaul,et al. Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.
[24] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[25] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[26] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[27] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[28] Joshua B. Tenenbaum,et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation , 2016, NIPS.
[29] Tom Schaul,et al. FeUdal Networks for Hierarchical Reinforcement Learning , 2017, ICML.
[30] Martin A. Riedmiller,et al. Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards , 2017, ArXiv.
[31] Nando de Freitas,et al. Sample Efficient Actor-Critic with Experience Replay , 2016, ICLR.
[32] Tom Schaul,et al. Learning from Demonstrations for Real World Reinforcement Learning , 2017, ArXiv.
[33] Marc G. Bellemare,et al. A Distributional Perspective on Reinforcement Learning , 2017, ICML.
[34] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[35] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[36] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[37] Silvio Savarese,et al. SURREAL: Open-Source Reinforcement Learning Framework and Robot Manipulation Benchmark , 2018, CoRL.
[38] Tom Schaul,et al. Rainbow: Combining Improvements in Deep Reinforcement Learning , 2017, AAAI.
[39] David Budden,et al. Distributed Prioritized Experience Replay , 2018, ICLR.
[40] David Silver,et al. Meta-Gradient Reinforcement Learning , 2018, NeurIPS.
[41] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[42] Xiaohui Ye,et al. Horizon: Facebook's Open Source Applied Reinforcement Learning Platform , 2018, ArXiv.
[43] Pieter Abbeel,et al. Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments , 2017, ICLR.
[44] Marc G. Bellemare,et al. Dopamine: A Research Framework for Deep Reinforcement Learning , 2018, ArXiv.
[45] Yuval Tassa,et al. Maximum a Posteriori Policy Optimisation , 2018, ICLR.
[46] Marcin Andrychowicz,et al. Overcoming Exploration in Reinforcement Learning with Demonstrations , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[47] Matthew W. Hoffman,et al. Distributed Distributional Deterministic Policy Gradients , 2018, ICLR.
[48] Rémi Munos,et al. Implicit Quantile Networks for Distributional Reinforcement Learning , 2018, ICML.
[49] Tom Schaul,et al. Deep Q-learning From Demonstrations , 2017, AAAI.
[50] Albin Cassirer,et al. Randomized Prior Functions for Deep Reinforcement Learning , 2018, NeurIPS.
[51] Michael I. Jordan,et al. Ray: A Distributed Framework for Emerging AI Applications , 2017, OSDI.
[52] Demis Hassabis,et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play , 2018, Science.
[53] Joelle Pineau,et al. Benchmarking Batch Deep Reinforcement Learning Algorithms , 2019, ArXiv.
[54] Matteo Hessel,et al. When to use parametric models in reinforcement learning? , 2019, NeurIPS.
[55] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[56] Doina Precup,et al. Off-Policy Deep Reinforcement Learning without Exploration , 2018, ICML.
[57] Rémi Munos,et al. Recurrent Experience Replay in Distributed Reinforcement Learning , 2018, ICLR.
[58] Jakub W. Pachocki,et al. Dota 2 with Large Scale Deep Reinforcement Learning , 2019, ArXiv.
[59] Edward Grefenstette,et al. TorchBeast: A PyTorch Platform for Distributed RL , 2019, ArXiv.
[60] Nando de Freitas,et al. Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning , 2018, ICML.
[61] Tor Lattimore,et al. Behaviour Suite for Reinforcement Learning , 2019, ICLR.
[62] Rishabh Agarwal,et al. An Optimistic Perspective on Offline Reinforcement Learning , 2019, ICML.
[63] Richard Tanburn,et al. Making Efficient Use of Demonstrations to Solve Hard Exploration Problems , 2019, ICLR.
[64] Oleg O. Sushkov,et al. Scaling data-driven robotics with reward sketching and batch reinforcement learning , 2019, Robotics: Science and Systems.
[65] Lianlong Wu,et al. Arena: A General Evaluation Platform and Building Toolkit for Multi-Agent Intelligence , 2019, AAAI.
[66] Raphaël Marinier,et al. SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference , 2019, ICLR.
[67] Kenneth O. Stanley,et al. Fiber: A Platform for Efficient Development and Distributed Training for Reinforcement Learning and Population-Based Methods , 2020, ArXiv.
[68] Jakub W. Pachocki,et al. Learning dexterous in-hand manipulation , 2018, Int. J. Robotics Res.
[69] Joelle Pineau,et al. Improving Reproducibility in Machine Learning Research (A Report from the NeurIPS 2019 Reproducibility Program) , 2020, J. Mach. Learn. Res.