Yee Whye Teh | Razvan Pascanu | Hyeonwoo Noh | Nicolas Heess | Alexandre Galashov | Dhruva Tirumala | Leonard Hasenclever | Arun Ahuja | Greg Wayne
[1] Max Welling, et al. Markov Chain Monte Carlo and Variational Inference: Bridging the Gap, 2014, ICML.
[2] Pieter Abbeel, et al. Equivalence Between Policy Gradients and Soft Q-Learning, 2017, ArXiv.
[3] Andrew Zisserman, et al. Kickstarting Deep Reinforcement Learning, 2018, ArXiv.
[4] David Barber, et al. An Auxiliary Variational Method, 2004, ICONIP.
[5] Sergey Levine, et al. Learning modular neural network policies for multi-task and multi-robot transfer, 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[6] Ruslan Salakhutdinov, et al. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, 2015, ICLR.
[7] Yee Whye Teh, et al. Transferring Task Goals via Hierarchical Reinforcement Learning, 2018.
[8] Oriol Vinyals, et al. Synthesizing Programs for Images using Reinforced Adversarial Learning, 2018, ICML.
[9] Yee Whye Teh, et al. Mix&Match - Agent Curricula for Reinforcement Learning, 2018, ICML.
[10] Yee Whye Teh, et al. Distral: Robust multitask reinforcement learning, 2017, NIPS.
[11] Emanuel Todorov, et al. Linearly-solvable Markov decision problems, 2006, NIPS.
[12] Nando de Freitas, et al. Sample Efficient Actor-Critic with Experience Replay, 2016, ICLR.
[13] Razvan Pascanu, et al. Policy Distillation, 2015, ICLR.
[14] Yee Whye Teh, et al. Information asymmetry in KL-regularized RL, 2019, ICLR.
[15] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[16] Martin A. Riedmiller, et al. Learning by Playing - Solving Sparse Reward Tasks from Scratch, 2018, ICML.
[17] Joshua B. Tenenbaum, et al. Learning to Share and Hide Intentions using Information Regularization, 2018, NeurIPS.
[18] Sergey Levine, et al. Diversity is All You Need: Learning Skills without a Reward Function, 2018, ICLR.
[19] N. Roy, et al. On Stochastic Optimal Control and Reinforcement Learning by Approximate Inference, 2013.
[20] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[21] Tom Schaul, et al. FeUdal Networks for Hierarchical Reinforcement Learning, 2017, ICML.
[22] Ion Stoica, et al. Multi-Level Discovery of Deep Options, 2017, ArXiv.
[23] Yuval Tassa, et al. Learning and Transfer of Modulated Locomotor Controllers, 2016, ArXiv.
[24] Ion Stoica, et al. DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations, 2017, CoRL.
[25] Yuval Tassa, et al. Emergence of Locomotion Behaviours in Rich Environments, 2017, ArXiv.
[26] Henry Zhu, et al. Dexterous Manipulation with Deep Reinforcement Learning: Efficient, General, and Low-Cost, 2018, 2019 International Conference on Robotics and Automation (ICRA).
[27] Karol Hausman, et al. Learning an Embedding Space for Transferable Robot Skills, 2018, ICLR.
[28] Pieter Abbeel, et al. Meta Learning Shared Hierarchies, 2017, ICLR.
[29] Naftali Tishby, et al. A Unified Bellman Equation for Causal Information and Value in Markov Decision Processes, 2017, ArXiv.
[30] Yuval Tassa, et al. Learning Continuous Control Policies by Stochastic Value Gradients, 2015, NIPS.
[31] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res..
[32] Daniel A. Braun, et al. Thermodynamics as a theory of decision-making with information-processing costs, 2012, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences.
[33] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[34] Marc Toussaint, et al. Robot trajectory optimization using approximate inference, 2009, ICML '09.
[35] Sergey Levine, et al. Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning, 2017, ICLR.
[36] Shane Legg, et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures, 2018, ICML.
[37] J. Andrew Bagnell, et al. Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy, 2010.
[38] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[39] Yee Whye Teh, et al. Neural probabilistic motor primitives for humanoid control, 2018, ICLR.
[40] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[41] Jordi Grau-Moya, et al. Soft Q-Learning with Mutual-Information Regularization, 2018, ICLR.
[42] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell..
[43] Ryan P. Adams, et al. Composing graphical models with neural networks for structured representations and fast inference, 2016, NIPS.
[44] Dale Schuurmans, et al. Bridging the Gap Between Value and Policy Based Reinforcement Learning, 2017, NIPS.
[45] Daniel Polani, et al. Information Theory of Decisions and Actions, 2011.
[46] Quoc V. Le, et al. Neural Architecture Search with Reinforcement Learning, 2016, ICLR.
[47] Marc G. Bellemare, et al. Safe and Efficient Off-Policy Reinforcement Learning, 2016, NIPS.
[48] Daan Wierstra, et al. Variational Intrinsic Control, 2016, ICLR.
[49] Pieter Abbeel, et al. Stochastic Neural Networks for Hierarchical Reinforcement Learning, 2016, ICLR.
[50] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[51] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[52] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[53] Sergey Levine, et al. Data-Efficient Hierarchical Reinforcement Learning, 2018, NeurIPS.
[54] Sergey Levine, et al. Reinforcement Learning with Deep Energy-Based Policies, 2017, ICML.
[55] Sergey Levine, et al. Time-Contrastive Networks: Self-Supervised Learning from Multi-view Observation, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[56] Naftali Tishby, et al. Trading Value and Information in MDPs, 2012.
[57] Roy Fox, et al. Taming the Noise in Reinforcement Learning via Soft Updates, 2015, UAI.
[58] Doina Precup, et al. An information-theoretic approach to curiosity-driven reinforcement learning, 2012, Theory in Biosciences.
[59] Jakub W. Pachocki, et al. Learning dexterous in-hand manipulation, 2018, Int. J. Robotics Res..
[60] Sergey Levine, et al. Near-Optimal Representation Learning for Hierarchical Reinforcement Learning, 2018, ICLR.
[61] Yuval Tassa, et al. Maximum a Posteriori Policy Optimisation, 2018, ICLR.
[62] Sergey Levine, et al. InfoBot: Transfer and Exploration via the Information Bottleneck, 2019, ICLR.
[63] Vicenç Gómez, et al. Optimal control as a graphical model inference problem, 2009, Machine Learning.
[64] Sergey Levine, et al. Divide-and-Conquer Reinforcement Learning, 2017, ICLR.
[65] Stuart J. Russell, et al. Reinforcement Learning with Hierarchies of Machines, 1997, NIPS.
[66] Pratik Rane, et al. Self-Critical Sequence Training for Image Captioning, 2018.
[67] Sergey Levine, et al. Latent Space Policies for Hierarchical Reinforcement Learning, 2018, ICML.