Sergey Levine | Honglak Lee | Ofir Nachum | Shixiang Gu
[1] Satinder P. Singh, et al. Transfer via soft homomorphisms, 2009, AAMAS.
[2] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[3] Oriol Vinyals, et al. Representation Learning with Contrastive Predictive Coding, 2018, ArXiv.
[4] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[5] Aaron C. Courville, et al. MINE: Mutual Information Neural Estimation, 2018, ArXiv.
[6] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[7] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[8] Doina Precup, et al. An information-theoretic approach to curiosity-driven reinforcement learning, 2012, Theory in Biosciences.
[9] Balaraman Ravindran, et al. Model Minimization in Hierarchical Reinforcement Learning, 2002, SARA.
[10] Michael L. Littman, et al. Near Optimal Behavior via Approximate State Abstraction, 2016, ICML.
[11] Andrew G. Barto, et al. Motor primitive discovery, 2012, IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL).
[12] Yoshua Bengio, et al. Learning deep representations by mutual information estimation and maximization, 2018, ICLR.
[13] James R. Evans, et al. Aggregation and Disaggregation Techniques and Methodology in Optimization, 1991, Oper. Res.
[14] Lihong Li, et al. PAC-inspired Option Discovery in Lifelong Reinforcement Learning, 2014, ICML.
[15] Robert Givan, et al. Model Reduction Techniques for Computing Approximately Optimal Solutions for Markov Decision Processes, 1997, UAI.
[16] Sergey Levine, et al. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates, 2017, IEEE International Conference on Robotics and Automation (ICRA).
[17] Benjamin Van Roy. Performance Loss Bounds for Approximate Value Iteration with State Aggregation, 2006, Math. Oper. Res.
[18] Ward Whitt, et al. Approximations of Dynamic Programs, I, 1978, Math. Oper. Res.
[19] Thomas J. Walsh, et al. Towards a Unified Theory of State Abstraction for MDPs, 2006, AI&M.
[20] Robert Givan, et al. Model Minimization in Markov Decision Processes, 1997, AAAI/IAAI.
[21] Martin A. Riedmiller, et al. Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images, 2015, NIPS.
[22] Stuart J. Russell, et al. Reinforcement Learning with Hierarchies of Machines, 1997, NIPS.
[23] Sergey Levine, et al. Data-Efficient Hierarchical Reinforcement Learning, 2018, NeurIPS.
[24] Charles Blundell, et al. Early Visual Concept Learning with Unsupervised Deep Learning, 2016, ArXiv.
[25] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[26] Tom Schaul, et al. FeUdal Networks for Hierarchical Reinforcement Learning, 2017, ICML.
[27] Kate Saenko, et al. Hierarchical Actor-Critic, 2017, ArXiv.
[28] D. Bertsekas, et al. Adaptive aggregation methods for infinite horizon dynamic programming, 1989.
[29] Kate Saenko, et al. Learning Multi-Level Hierarchies with Hindsight, 2017, ICLR.
[30] Sridhar Mahadevan, et al. Recent Advances in Hierarchical Reinforcement Learning, 2003, Discret. Event Dyn. Syst.
[31] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[32] Jürgen Schmidhuber, et al. Planning simple trajectories using neural subgoal generators, 1993.
[33] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[34] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[35] Pieter Abbeel, et al. Constrained Policy Optimization, 2017, ICML.