SkillS: Adaptive Skill Sequencing for Efficient Temporally-Extended Exploration
Martin A. Riedmiller, N. Heess, Roland Hafner, T. Lampe, Dhruva Tirumala, Markus Wulfmeier, A. Abdolmaleki, Michael Neunert, Tim Hertweck, Tuomas Haarnoja, Fereshteh Sadeghi, Dushyant Rao, G. Vezzani, Jan Humplik, C. Fantacci, Ben Moran