Sergey Levine | Yoshua Bengio | Jonathan Binas | Anirudh Goyal | Shagun Sodhani | Xue Bin Peng
[1] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[2] R. J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[3] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[4] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[5] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[6] Naftali Tishby, et al. The information bottleneck method, 2000, ArXiv.
[7] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[8] Geoffrey E. Hinton, et al. Visualizing Data using t-SNE, 2008.
[9] Jan Peters, et al. Hierarchical Relative Entropy Policy Search, 2014, AISTATS.
[10] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[11] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[12] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[13] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[14] Stefano Soatto, et al. Information Dropout: learning optimal representations through noise, 2017, ArXiv.
[15] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[16] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[17] Joshua B. Tenenbaum, et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, 2016, NIPS.
[18] Wojciech Zaremba, et al. OpenAI Gym, 2016, ArXiv.
[19] Tom Schaul, et al. FeUdal Networks for Hierarchical Reinforcement Learning, 2017, ICML.
[20] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[21] Elman Mansimov, et al. Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation, 2017, NIPS.
[22] Li Fei-Fei, et al. Inferring and Executing Programs for Visual Reasoning, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[23] Dan Klein, et al. Modular Multitask Reinforcement Learning with Policy Sketches, 2016, ICML.
[24] Pieter Abbeel, et al. Stochastic Neural Networks for Hierarchical Reinforcement Learning, 2016, ICLR.
[25] Marlos C. Machado, et al. A Laplacian Framework for Option Discovery in Reinforcement Learning, 2017, ICML.
[26] Glen Berseth, et al. DeepLoco, 2017, ACM Trans. Graph.
[27] Alexander A. Alemi, et al. Deep Variational Information Bottleneck, 2017, ICLR.
[28] Yuval Tassa, et al. Emergence of Locomotion Behaviours in Rich Environments, 2017, ArXiv.
[29] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.
[30] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[31] J. Hodgins, et al. Learning to Schedule Control Fragments for Physics-Based Characters Using Deep Q-Learning, 2017, ACM Trans. Graph.
[32] Sergey Levine, et al. DeepMimic, 2018, ACM Trans. Graph.
[33] Pieter Abbeel, et al. Meta Learning Shared Hierarchies, 2017, ICLR.
[34] Sergey Levine, et al. Latent Space Policies for Hierarchical Reinforcement Learning, 2018, ICML.
[35] Bernhard Schölkopf, et al. Learning Independent Causal Mechanisms, 2017, ICML.
[36] Mykel J. Kochenderfer, et al. Model Primitive Hierarchical Lifelong Reinforcement Learning, 2019, AAMAS.
[37] Sergey Levine, et al. Diversity is All You Need: Learning Skills without a Reward Function, 2018, ICLR.
[38] Yee Whye Teh, et al. Neural probabilistic motor primitives for humanoid control, 2018, ICLR.
[39] Nicolas Heess, et al. Hierarchical visuomotor control of humanoids, 2018, ICLR.
[40] Ignacio Cases, et al. Routing Networks and the Challenges of Modular and Compositional Computation, 2019, ArXiv.