Skill-based Model-based Reinforcement Learning
[1] Xiaolong Wang, et al. Temporal Difference Learning for Model Predictive Control, 2022, ICML.
[2] W. Burgard, et al. CALVIN: A Benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks, 2021, IEEE Robotics and Automation Letters.
[3] Joseph J. Lim, et al. Adversarial Skill Chaining for Long-Horizon Robot Manipulation via Terminal State Regularization, 2021, CoRL.
[4] S. Levine, et al. Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning, 2021, ICLR.
[5] Ruslan Salakhutdinov, et al. Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives, 2021, NeurIPS.
[6] Oleh Rybkin, et al. Discovering and Achieving Goals via World Models, 2021, NeurIPS.
[7] Li Fei-Fei, et al. Example-Driven Model-Based Reinforcement Learning for Solving Long-Horizon Visuomotor Tasks, 2021, CoRL.
[8] Joseph J. Lim, et al. Demonstration-Guided Reinforcement Learning with Learned Skills, 2021, CoRL.
[9] Joshua B. Tenenbaum, et al. Learning Task Decomposition with Ordered Memory Policy Network, 2021, ICLR.
[10] P. Abbeel, et al. Reset-Free Lifelong Learning with Skill-Space Planning, 2020, ICLR.
[11] Florian Shkurti, et al. Latent Skill Planning for Exploration and Transfer, 2020, ICLR.
[12] Joseph J. Lim, et al. Accelerating Reinforcement Learning with Learned Skill Priors, 2020, CoRL.
[13] Gabriel Dulac-Arnold, et al. Model-Based Offline Planning, 2020, ICLR.
[14] Abhinav Gupta, et al. Learning Robot Skills with Temporal Variational Inference, 2020, ICML.
[15] Pieter Abbeel, et al. Planning to Explore via Self-Supervised World Models, 2020, ICML.
[16] Joseph J. Lim, et al. Learning to Coordinate Manipulation Skills via Skill Behavior Diversification, 2020, ICLR.
[17] Abhinav Gupta, et al. Discovering Motor Programs by Recomposing Demonstrations, 2020, ICLR.
[18] Justin Fu, et al. D4RL: Datasets for Deep Data-Driven Reinforcement Learning, 2020, arXiv.
[19] Li Fei-Fei, et al. Learning to Generalize Across Long-Horizon Tasks from Human Demonstrations, 2020, Robotics: Science and Systems.
[20] Jimmy Ba, et al. Dream to Control: Learning Behaviors by Latent Imagination, 2019, ICLR.
[21] Joseph J. Lim, et al. IKEA Furniture Assembly Environment for Long-Horizon Complex Manipulation Tasks, 2019, 2021 IEEE International Conference on Robotics and Automation (ICRA).
[22] D. Fox, et al. IRIS: Implicit Reinforcement without Interaction at Scale for Learning Control from Offline Robot Manipulation Data, 2019, 2020 IEEE International Conference on Robotics and Automation (ICRA).
[23] Sergey Levine, et al. Relay Policy Learning: Solving Long-Horizon Tasks via Imitation and Reinforcement Learning, 2019, CoRL.
[24] S. Levine, et al. RoboNet: Large-Scale Multi-Robot Learning, 2019, CoRL.
[25] S. Levine, et al. Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning?, 2019, arXiv.
[26] Sergey Levine, et al. Dynamics-Aware Unsupervised Discovery of Skills, 2019, ICLR.
[27] Jimmy Ba, et al. Exploring Model-based Planning with Policy Networks, 2019, ICLR.
[28] Sergey Levine, et al. When to Trust Your Model: Model-Based Policy Optimization, 2019, NeurIPS.
[29] S. Levine, et al. Learning Latent Plans from Play, 2019, CoRL.
[30] Pushmeet Kohli, et al. CompILE: Compositional Imitation Learning and Execution, 2018, ICML.
[31] Li Fei-Fei, et al. ROBOTURK: A Crowdsourcing Platform for Robotic Skill Learning through Imitation, 2018, CoRL.
[32] Sham M. Kakade, et al. Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control, 2018, ICLR.
[33] Joseph J. Lim, et al. Composing Complex Skills by Learning Transition Policies, 2018, ICLR.
[34] Sergey Levine, et al. Data-Efficient Hierarchical Reinforcement Learning, 2018, NeurIPS.
[35] Jürgen Schmidhuber, et al. World Models, 2018, arXiv.
[36] Shimon Whiteson, et al. TACO: Learning Task Decomposition via Temporal Alignment for Control, 2018, ICML.
[37] Sergey Levine, et al. Diversity is All You Need: Learning Skills without a Reward Function, 2018, ICLR.
[38] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[39] Yuval Tassa, et al. DeepMind Control Suite, 2018, arXiv.
[40] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017, NIPS Autodiff Workshop.
[41] Christopher Burgess, et al. beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework, 2017, ICLR.
[42] James M. Rehg, et al. Aggressive driving with model predictive path integral control, 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[43] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[44] Evangelos Theodorou, et al. Model Predictive Path Integral Control using Covariance Variable Importance Sampling, 2015, arXiv.
[45] Ari Weinstein, et al. Model-based hierarchical reinforcement learning and human action control, 2014, Philosophical Transactions of the Royal Society B: Biological Sciences.
[46] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[47] Oliver Kroemer, et al. Learning to select and generalize striking movements in robot table tennis, 2012, AAAI Fall Symposium: Robots Learning Interactively from Human Teachers.
[48] Stefan Schaal, et al. Learning and generalization of motor skills by learning from demonstration, 2009, 2009 IEEE International Conference on Robotics and Automation (ICRA).
[49] Shane Legg, et al. Universal Intelligence: A Definition of Machine Intelligence, 2007, Minds and Machines.
[50] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artificial Intelligence.
[51] Reuven Y. Rubinstein, et al. Optimization of computer simulation models with rare events, 1997, European Journal of Operational Research.
[52] Dean Pomerleau. ALVINN: An Autonomous Land Vehicle in a Neural Network, 1988, NIPS.
[53] Richard S. Sutton. Temporal credit assignment in reinforcement learning, 1984, PhD thesis, University of Massachusetts Amherst.