[1] Pushmeet Kohli, et al. CompILE: Compositional Imitation Learning and Execution, 2018, ICML.
[2] Ofir Nachum, et al. Provable Representation Learning for Imitation with Contrastive Fourier Features, 2021, NeurIPS.
[3] Thomas G. Dietterich. The MAXQ Method for Hierarchical Reinforcement Learning, 1998, ICML.
[4] Sergey Levine, et al. Near-Optimal Representation Learning for Hierarchical Reinforcement Learning, 2018, ICLR.
[5] Michael I. Jordan, et al. Reinforcement Learning with Soft State Aggregation, 1994, NIPS.
[6] Marc G. Bellemare, et al. DeepMDP: Learning Continuous Latent Space Models for Representation Learning, 2019, ICML.
[7] Daniel Berend, et al. On the Convergence of the Empirical Distribution, 2012, arXiv:1205.6711.
[8] Sergey Levine, et al. MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies, 2019, NeurIPS.
[9] Joshua B. Tenenbaum, et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, 2016, NIPS.
[10] Stuart J. Russell, et al. Reinforcement Learning with Hierarchies of Machines, 1997, NIPS.
[11] Tom Schaul, et al. FeUdal Networks for Hierarchical Reinforcement Learning, 2017, ICML.
[12] Ion Stoica, et al. DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations, 2017, CoRL.
[13] Doina Precup, et al. Learning Options in Reinforcement Learning, 2002, SARA.
[14] Dean Pomerleau. ALVINN: An Autonomous Land Vehicle in a Neural Network, 1988, NIPS.
[15] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[16] Sergey Levine, et al. Learning Latent Plans from Play, 2019, CoRL.
[17] Sanjeev Arora, et al. Provable Representation Learning for Imitation Learning via Bi-level Optimization, 2020, ICML.
[18] Sergey Levine, et al. Data-Efficient Hierarchical Reinforcement Learning, 2018, NeurIPS.
[19] Abhinav Gupta, et al. Discovering Motor Programs by Recomposing Demonstrations, 2020, ICLR.
[20] M. Mehdi Afsar, et al. Reinforcement learning based recommender systems: A survey, 2021, arXiv.
[21] Stefan Schaal, et al. Is imitation learning the route to humanoid robots?, 1999, Trends in Cognitive Sciences.
[22] Ion Stoica, et al. Multi-Level Discovery of Deep Options, 2017, arXiv.
[23] Sergey Levine, et al. Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction, 2019, NeurIPS.
[24] Sergey Levine, et al. D4RL: Datasets for Deep Data-Driven Reinforcement Learning, 2020, arXiv.
[25] J. Andrew Bagnell, et al. Efficient Reductions for Imitation Learning, 2010, AISTATS.
[26] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[27] Bo Dai, et al. Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach, 2021, EMNLP.
[28] Misha Denil, et al. Offline Learning from Demonstrations and Unlabeled Experience, 2020, arXiv.
[29] Benjamin Recht, et al. Random Features for Large-Scale Kernel Machines, 2007, NIPS.
[30] Daan Wierstra, et al. Variational Intrinsic Control, 2016, ICLR.
[31] Pieter Abbeel, et al. Skill Preferences: Learning to Extract and Execute Robotic Skills from Human Feedback, 2021, CoRL.
[32] B. Krogh, et al. State aggregation in Markov decision processes, 2002, Proceedings of the 41st IEEE Conference on Decision and Control.
[33] Qiangxing Tian, et al. Learn Goal-Conditioned Policy with Intrinsic Motivation for Deep Reinforcement Learning, 2021, AAAI.
[34] Sergey Levine, et al. OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning, 2021, ICLR.
[35] Pieter Abbeel, et al. Constrained Policy Optimization, 2017, ICML.
[36] Pieter Abbeel, et al. Hierarchical Few-Shot Imitation with Skill Transition Models, 2021, arXiv.
[37] Karol Hausman, et al. Learning an Embedding Space for Transferable Robot Skills, 2018, ICLR.
[38] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[39] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[40] Abhinav Gupta, et al. Learning Robot Skills with Temporal Variational Inference, 2020, ICML.
[41] Nando de Freitas, et al. Critic Regularized Regression, 2020, NeurIPS.
[42] Sergio Gomez Colmenarejo, et al. RL Unplugged: Benchmarks for Offline Reinforcement Learning, 2020, arXiv.
[43] Vikash Kumar, et al. Multi-Agent Manipulation via Locomotion using Hierarchical Sim2Real, 2019, CoRL.
[44] Nikolai Matni, et al. Closing the Closed-Loop Distribution Shift in Safe Imitation Learning, 2021, arXiv.
[45] David Warde-Farley, et al. Unsupervised Control Through Non-Parametric Discriminative Rewards, 2018, ICLR.
[46] Sergey Levine, et al. Dynamics-Aware Unsupervised Discovery of Skills, 2019, ICLR.
[47] Joseph J. Lim, et al. Accelerating Reinforcement Learning with Learned Skill Priors, 2020, CoRL.
[48] Sergey Levine, et al. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems, 2020, arXiv.
[49] Sergey Levine, et al. Parrot: Data-Driven Behavioral Priors for Reinforcement Learning, 2020, ICLR.
[50] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[51] Doina Precup, et al. Using Bisimulation for Policy Transfer in MDPs, 2010, AAAI.
[52] Sergey Levine, et al. Diversity is All You Need: Learning Skills without a Reward Function, 2018, ICLR.
[53] Rowan McAllister, et al. Learning Invariant Representations for Reinforcement Learning without Reconstruction, 2020, ICLR.
[54] Yifan Wu, et al. Behavior Regularized Offline Reinforcement Learning, 2019, arXiv.