Latent Space Policies for Hierarchical Reinforcement Learning
Sergey Levine | Pieter Abbeel | Tuomas Haarnoja | Kristian Hartikainen
[1] Sergey Levine, et al. Reinforcement Learning with Deep Energy-Based Policies, 2017, ICML.
[2] Samy Bengio, et al. Density estimation using Real NVP, 2016, ICLR.
[3] Marc Toussaint, et al. On Stochastic Optimal Control and Reinforcement Learning by Approximate Inference, 2012, Robotics: Science and Systems.
[4] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[5] Roy Fox, et al. Taming the Noise in Reinforcement Learning via Soft Updates, 2015, UAI.
[6] Jan Peters, et al. Hierarchical Relative Entropy Policy Search, 2014, AISTATS.
[7] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[8] Yasemin Altun, et al. Relative Entropy Policy Search, 2010.
[9] Sergey Levine, et al. Diversity is All You Need: Learning Skills without a Reward Function, 2018, ICLR.
[10] Marc Toussaint, et al. Robot trajectory optimization using approximate inference, 2009, ICML '09.
[11] Benjamin Rosman, et al. Hierarchy Through Composition with Multitask LMDPs, 2017, ICML.
[12] Jan Peters, et al. Data-Efficient Generalization of Robot Skills with Contextual Policy Search, 2013, AAAI.
[13] Geoffrey E. Hinton, et al. Reducing the Dimensionality of Data with Neural Networks, 2006, Science.
[14] J. Andrew Bagnell, et al. Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy, 2010.
[15] Stefan Schaal, et al. Reinforcement Learning With Sequences of Motion Primitives for Robust Manipulation, 2012, IEEE Transactions on Robotics.
[16] Patrick MacAlpine, et al. Overlapping layered learning, 2018, Artif. Intell.
[17] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[18] Yuval Tassa, et al. Learning and Transfer of Modulated Locomotor Controllers, 2016, ArXiv.
[19] Karol Hausman, et al. Learning an Embedding Space for Transferable Robot Skills, 2018, ICLR.
[20] Dale Schuurmans, et al. Bridging the Gap Between Value and Policy Based Reinforcement Learning, 2017, NIPS.
[21] Joshua B. Tenenbaum, et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, 2016, NIPS.
[22] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[23] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[24] H. Kappen. Path integrals and symmetry breaking for optimal control theory, 2005, physics/0505066.
[25] Emanuel Todorov, et al. Linearly-solvable Markov decision problems, 2006, NIPS.
[26] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[27] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[28] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[29] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[30] Tom Schaul, et al. FeUdal Networks for Hierarchical Reinforcement Learning, 2017, ICML.
[31] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[32] Pieter Abbeel, et al. Stochastic Neural Networks for Hierarchical Reinforcement Learning, 2016, ICLR.
[33] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[34] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[35] Shie Mannor, et al. A Deep Hierarchical Approach to Lifelong Learning in Minecraft, 2016, AAAI.
[36] Gerhard Neumann, et al. Variational Inference for Policy Search in changing situations, 2011, ICML.
[37] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[38] Pieter Abbeel, et al. Equivalence Between Policy Gradients and Soft Q-Learning, 2017, ArXiv.