Efficient Planning in a Compact Latent Action Space
Zhengyao Jiang, Tianjun Zhang, Michael Janner, Yueying Li, Tim Rocktäschel, Edward Grefenstette, Yuandong Tian
[1] Kan Ren, et al. Bootstrapped Transformer for Offline Reinforcement Learning, 2022, NeurIPS.
[2] S. Levine, et al. Planning with Diffusion for Flexible Behavior Synthesis, 2022, ICML.
[3] Sergio Gomez Colmenarejo, et al. A Generalist Agent, 2022, Trans. Mach. Learn. Res.
[4] S. Levine, et al. ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters, 2022, ACM Trans. Graph.
[5] Amy Zhang, et al. Online Decision Transformer, 2022, ICML.
[6] Pieter Abbeel, et al. Mastering Atari Games with Limited Data, 2021, NeurIPS.
[7] Robert Dadashi, et al. Continuous Control with Action Quantization from Demonstrations, 2021, ICML.
[8] Sergey Levine, et al. Offline Reinforcement Learning with Implicit Q-Learning, 2021, ICLR.
[9] Michael A. Osborne, et al. Revisiting Design Choices in Offline Model-Based Reinforcement Learning, 2021, ICLR.
[10] Jeannette Bohg, et al. Learning latent actions to control assistive robots, 2021, Autonomous Robots.
[11] Dan Klein, et al. Learning Space Partitions for Path Planning, 2021, NeurIPS.
[12] Scott Fujimoto, et al. A Minimalist Approach to Offline Reinforcement Learning, 2021, NeurIPS.
[13] Ali Razavi, et al. Vector Quantized Models for Planning, 2021, ICML.
[14] Sergey Levine, et al. Offline Reinforcement Learning as One Big Sequence Modeling Problem, 2021, NeurIPS.
[15] Pieter Abbeel, et al. Decision Transformer: Reinforcement Learning via Sequence Modeling, 2021, NeurIPS.
[16] Dorsa Sadigh, et al. Learning Visually Guided Latent Actions for Assistive Teleoperation, 2021, L4DC.
[17] Silvio Savarese, et al. LASER: Learning a Latent Action Space for Efficient Reinforcement Learning, 2021, ICRA.
[18] David Held, et al. PLAS: Latent Action Space for Offline Reinforcement Learning, 2020, CoRL.
[19] Jessica B. Hamrick, et al. On the role of planning in model-based deep reinforcement learning, 2020, ICLR.
[20] Mohammad Norouzi, et al. Mastering Atari with Discrete World Models, 2020, ICLR.
[21] Andrew Gordon Wilson, et al. On the model-based stochastic value gradient for continuous reinforcement learning, 2020, L4DC.
[22] Wei Chen, et al. TrajVAE: A Variational AutoEncoder model for trajectory generation, 2020, Neurocomputing.
[23] Yuandong Tian, et al. Learning Search Space Partition for Black-box Optimization using Monte Carlo Tree Search, 2020, NeurIPS.
[24] S. Levine, et al. Conservative Q-Learning for Offline Reinforcement Learning, 2020, NeurIPS.
[25] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[26] Lantao Yu, et al. MOPO: Model-based Offline Policy Optimization, 2020, NeurIPS.
[27] T. Joachims, et al. MOReL: Model-Based Offline Reinforcement Learning, 2020, NeurIPS.
[28] Justin Fu, et al. D4RL: Datasets for Deep Data-Driven Reinforcement Learning, 2020, ArXiv.
[29] Jürgen Schmidhuber, et al. Reinforcement Learning Upside Down: Don't Predict Rewards - Just Map Them to Actions, 2019, ArXiv.
[30] Demis Hassabis, et al. Mastering Atari, Go, chess and shogi by planning with a learned model, 2019, Nature.
[31] D. Fox, et al. IRIS: Implicit Reinforcement without Interaction at Scale for Learning Control from Offline Robot Manipulation Data, 2019, ICRA.
[32] Jimmy Ba, et al. Exploring Model-based Planning with Policy Networks, 2019, ICLR.
[33] Takeshi Nishida, et al. Trajectory Prediction with a Conditional Variational Autoencoder, 2019, J. Robotics Mechatronics.
[34] Sergey Levine, et al. Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction, 2019, NeurIPS.
[35] S. Levine, et al. Learning Latent Plans from Play, 2019, CoRL.
[36] Demis Hassabis, et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play, 2018, Science.
[37] Ruben Villegas, et al. Learning Latent Dynamics for Planning from Pixels, 2018, ICML.
[38] Yee Whye Teh, et al. Neural probabilistic motor primitives for humanoid control, 2018, ICLR.
[39] Katja Hofmann, et al. Trajectory VAE for multi-modal imitation, 2018.
[40] Sergey Levine, et al. Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings, 2018, ICML.
[41] Oriol Vinyals, et al. Neural Discrete Representation Learning, 2017, NIPS.
[42] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[43] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[44] Koray Kavukcuoglu, et al. Pixel Recurrent Neural Networks, 2016, ICML.
[45] Yuval Tassa, et al. Learning Continuous Control Policies by Stochastic Value Gradients, 2015, NIPS.
[46] Martin A. Riedmiller, et al. Approximate model-assisted Neural Fitted Q-Iteration, 2014, IJCNN.
[47] Carl E. Rasmussen, et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.
[48] Richard S. Sutton, et al. Sample-based learning and search with permanent and transient memories, 2008, ICML.
[49] Michael Fairbank, et al. Reinforcement Learning by Value Gradients, 2008, ArXiv.
[50] Pierre Geurts, et al. Tree-Based Batch Mode Reinforcement Learning, 2005, J. Mach. Learn. Res.
[51] Sven Koenig, et al. Abstraction, Reformulation, and Approximation: 5th International Symposium, SARA 2002, Kananaskis, Alberta, Canada, August 2-4, 2002, Proceedings, 2002.
[52] Doina Precup, et al. Learning Options in Reinforcement Learning, 2002, SARA.
[53] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[54] Hermann Ney, et al. Accelerated DP based search for statistical translation, 1997, EUROSPEECH.
[55] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[56] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[57] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.