Sergey Levine | Ruslan Salakhutdinov | Benjamin Eysenbach
[1] Stefano Ermon,et al. Generative Adversarial Imitation Learning , 2016, NIPS.
[2] Sergey Levine,et al. Learning Robust Rewards with Adversarial Inverse Reinforcement Learning , 2017, ICLR.
[3] Sergey Levine,et al. D4RL: Datasets for Deep Data-Driven Reinforcement Learning , 2020, ArXiv.
[4] J. Andrew Bagnell,et al. Maximum margin planning , 2006, ICML.
[5] Anca D. Dragan,et al. SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards , 2019, ICLR.
[6] Sergey Levine,et al. Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning , 2019, CoRL.
[7] Sergey Levine,et al. Few-Shot Goal Inference for Visuomotor Learning and Planning , 2018, CoRL.
[8] Dean Pomerleau,et al. ALVINN, an autonomous land vehicle in a neural network , 1988, NIPS.
[9] Tucker Hermans,et al. Multi-Fingered Grasp Planning via Inference in Deep Neural Networks , 2020, ArXiv.
[10] Roy Fox,et al. Taming the Noise in Reinforcement Learning via Soft Updates , 2015, UAI.
[11] Tucker Hermans,et al. Multifingered Grasp Planning via Inference in Deep Neural Networks: Outperforming Sampling by Learning Differentiable Models , 2020, IEEE Robotics & Automation Magazine.
[12] Sergey Levine,et al. C-Learning: Learning to Achieve Goals via Recursive Classification , 2020, ICLR.
[13] Richard S. Sutton,et al. TD Models: Modeling the World at a Mixture of Time Scales , 1995, ICML.
[14] Sergey Levine,et al. Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition , 2018, NeurIPS.
[15] Ilya Kostrikov,et al. Imitation Learning via Off-Policy Distribution Matching , 2019, ICLR.
[16] Peter Dayan,et al. Improving Generalization for Temporal Difference Learning: The Successor Representation , 1993, Neural Computation.
[17] Sergey Levine,et al. DisCo RL: Distribution-Conditioned Reinforcement Learning for General-Purpose Policies , 2021, ICRA.
[18] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[19] Misha Denil,et al. Offline Learning from Demonstrations and Unlabeled Experience , 2020, ArXiv.
[20] Misha Denil,et al. Task-Relevant Adversarial Imitation Learning , 2019, CoRL.
[21] Sergey Levine,et al. Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations , 2017, Robotics: Science and Systems.
[22] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[23] Leslie Pack Kaelbling,et al. Learning to Achieve Goals , 1993, IJCAI.
[24] Nando de Freitas,et al. Semi-supervised reward learning for offline reinforcement learning , 2020, ArXiv.
[25] Tom Schaul,et al. Successor Features for Transfer in Reinforcement Learning , 2016, NIPS.
[26] Jing Peng,et al. Function Optimization using Connectionist Reinforcement Learning Algorithms , 1991, Connection Science.
[27] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[28] Samuel Gershman,et al. Deep Successor Reinforcement Learning , 2016, ArXiv.
[29] Oleg O. Sushkov,et al. A Practical Approach to Insertion with Variable Socket Position Using Deep Reinforcement Learning , 2018, ICRA.
[30] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[31] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[32] Sergey Levine,et al. End-to-End Robotic Reinforcement Learning without Reward Engineering , 2019, Robotics: Science and Systems.
[33] Sergey Levine,et al. Off-Policy Evaluation via Off-Policy Classification , 2019, NeurIPS.
[34] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[35] Tom Schaul,et al. Universal Value Function Approximators , 2015, ICML.
[36] Pieter Abbeel,et al. CURL: Contrastive Unsupervised Representations for Reinforcement Learning , 2020, ICML.
[37] Sergey Levine,et al. Near-Optimal Representation Learning for Hierarchical Reinforcement Learning , 2018, ICLR.
[38] Xingyu Lin,et al. Reinforcement Learning without Ground-Truth State , 2019, ArXiv.
[39] Michael I. Jordan,et al. MIT Artificial Intelligence Laboratory and Center for Biological and Computational Learning, Department of Brain and Cognitive Sciences, technical report , 1996.
[40] Markus Wulfmeier,et al. Maximum Entropy Deep Inverse Reinforcement Learning , 2015, ArXiv.
[41] Andrew Owens,et al. The Feeling of Success: Does Touch Sensing Help Predict Grasp Outcomes? , 2017, CoRL.
[42] Charles Elkan,et al. Learning classifiers from only positive and unlabeled data , 2008, KDD.
[43] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[44] Misha Denil,et al. Positive-Unlabeled Reward Learning , 2019, CoRL.