Jordi Torres | Richard Socher | Xavier Giró-i-Nieto | Caiming Xiong | Alexander R. Trott | Víctor Campos
[1] David Warde-Farley, et al. Unsupervised Control Through Non-Parametric Discriminative Rewards, 2018, ICLR.
[2] Sergey Levine, et al. Dynamics-Aware Unsupervised Discovery of Skills, 2019, ICLR.
[3] Kenneth O. Stanley, et al. Novelty Search and the Problem with Objectives, 2011.
[4] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[5] Kaiming He, et al. Momentum Contrast for Unsupervised Visual Representation Learning, 2019, CVPR.
[6] Filip De Turck, et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning, 2016, NIPS.
[7] Christoph Salge, et al. Empowerment: An Introduction, 2013, arXiv.
[8] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[9] Jordi Torres, et al. Mining urban events from the tweet stream through a probabilistic mixture model, 2018, Data Mining and Knowledge Discovery.
[10] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[11] David Barber, et al. The IM algorithm: a variational approach to Information Maximization, 2003, NIPS.
[12] R. Miikkulainen, et al. Learning Behavior Characterizations for Novelty Search, 2016, GECCO.
[13] Wojciech M. Czarnecki, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning, 2019, Nature.
[14] Tom Schaul, et al. Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement, 2018, ICML.
[15] Shane Legg, et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures, 2018, ICML.
[16] Sham M. Kakade, et al. Provably Efficient Maximum Entropy Exploration, 2018, ICML.
[17] Shakir Mohamed, et al. Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, 2015, NIPS.
[18] Sebastian Scherer, et al. Improving Stochastic Policy Gradients in Continuous Control with Deep Reinforcement Learning using the Beta Distribution, 2017, ICML.
[19] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[20] Tom Schaul, et al. Successor Features for Transfer in Reinforcement Learning, 2016, NIPS.
[21] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, arXiv.
[22] Amos J. Storkey, et al. Exploration by Random Network Distillation, 2018, ICLR.
[23] Richard Socher, et al. Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards, 2019, NeurIPS.
[24] Kenneth O. Stanley, et al. Quality Diversity: A New Frontier for Evolutionary Computation, 2016, Front. Robot. AI.
[25] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[26] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[27] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[28] Ali Razavi, et al. Data-Efficient Image Recognition with Contrastive Predictive Coding, 2019, ICML.
[29] Sergey Levine, et al. Data-Efficient Hierarchical Reinforcement Learning, 2018, NeurIPS.
[30] Sergey Levine, et al. Reinforcement Learning with Deep Energy-Based Policies, 2017, ICML.
[31] Stuart J. Russell, et al. Reinforcement Learning with Hierarchies of Machines, 1997, NIPS.
[32] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[33] Daan Wierstra, et al. Variational Intrinsic Control, 2016, ICLR.
[34] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.
[35] Doina Precup. Temporal abstraction in reinforcement learning, 2000, Ph.D. thesis, University of Massachusetts Amherst.
[36] Sergey Levine, et al. Skew-Fit: State-Covering Self-Supervised Reinforcement Learning, 2019, ICML.
[37] Sergey Levine, et al. Efficient Exploration via State Marginal Matching, 2019, arXiv.
[38] Tom Schaul, et al. Deep Q-learning From Demonstrations, 2017, AAAI.
[39] Martin A. Riedmiller, et al. Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards, 2017, arXiv.
[40] Oriol Vinyals, et al. Neural Discrete Representation Learning, 2017, NIPS.
[41] Pieter Abbeel, et al. Stochastic Neural Networks for Hierarchical Reinforcement Learning, 2016, ICLR.
[42] Tom Schaul, et al. Universal Successor Features Approximators, 2018, ICLR.
[43] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[44] Sergey Levine, et al. Learning Latent Plans from Play, 2019, CoRL.
[45] Benjamin Van Roy, et al. Generalization and Exploration via Randomized Value Functions, 2014, ICML.
[46] Sergey Levine, et al. Diversity is All You Need: Learning Skills without a Reward Function, 2018, ICLR.
[47] Ion Stoica, et al. Multi-Level Discovery of Deep Options, 2017, arXiv.
[48] Tomohide Shibata. Understand in 5 Minutes!? Skimming Famous Papers: Jacob Devlin et al., BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2020.
[49] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[50] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[51] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[52] Kenneth O. Stanley, et al. Abandoning Objectives: Evolution Through the Search for Novelty Alone, 2011, Evolutionary Computation.
[53] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[54] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[55] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[56] Guy Lever, et al. Human-level performance in 3D multiplayer games with population-based reinforcement learning, 2018, Science.
[57] Allan Jabri, et al. Unsupervised Curricula for Visual Meta-Reinforcement Learning, 2019, NeurIPS.
[58] Katja Hofmann, et al. The MineRL Competition on Sample Efficient Reinforcement Learning using Human Priors, 2019, arXiv.
[59] Marcin Andrychowicz, et al. Solving Rubik's Cube with a Robot Hand, 2019, arXiv.
[60] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IROS.
[61] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, CVPR Workshops (CVPRW).
[62] Satinder Singh, et al. Self-Imitation Learning, 2018, ICML.
[63] Martin A. Riedmiller, et al. Self-supervised Learning of Image Embedding for Continuous Control, 2019, arXiv.
[64] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[65] Pieter Abbeel, et al. Meta Learning Shared Hierarchies, 2017, ICLR.
[66] Kenneth O. Stanley, et al. Go-Explore: A New Approach for Hard-Exploration Problems, 2019, arXiv.
[67] Ilya Kostrikov, et al. Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play, 2017, ICLR.
[68] Antoine Cully, et al. Robots that can adapt like animals, 2014, Nature.
[69] Jean-Baptiste Mouret, et al. Illuminating search spaces by mapping elites, 2015, arXiv.
[70] David A. Patterson, et al. In-datacenter performance analysis of a tensor processing unit, 2017, ISCA.
[71] David Warde-Farley, et al. Fast Task Inference with Variational Intrinsic Successor Features, 2019, ICLR.
[72] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[73] Richard Socher, et al. Competitive Experience Replay, 2019, ICLR.
[74] Kenneth O. Stanley, et al. Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents, 2017, NeurIPS.
[75] Pieter Abbeel, et al. Variational Option Discovery Algorithms, 2018, arXiv.