Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations
Cong Lu | Philip J. Ball | Tim G. J. Rudner | Jack Parker-Holder | Michael A. Osborne | Yee Whye Teh
[1] Vincent Michalski, et al. Learning Robust Dynamics through Variational Sparse Gating, 2022, NeurIPS.
[2] M. P. Kumar, et al. In Defense of the Unitary Scalarization for Deep Multi-Task Learning, 2022, NeurIPS.
[3] Sergey Levine, et al. Offline Reinforcement Learning with Implicit Q-Learning, 2021, ICLR.
[4] Michael A. Osborne, et al. Revisiting Design Choices in Offline Model-Based Reinforcement Learning, 2021, ICLR.
[5] Jonathan Tompson, et al. Implicit Behavioral Cloning, 2021, CoRL.
[6] Silvio Savarese, et al. What Matters in Learning from Offline Human Demonstrations for Robot Manipulation, 2021, CoRL.
[7] Alessandro Lazaric, et al. Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning, 2021, ICLR.
[8] Stefano Ermon, et al. Temporal Predictive Coding For Model-Based Planning In Latent Space, 2021, ICML.
[9] Scott Fujimoto, et al. A Minimalist Approach to Offline Reinforcement Learning, 2021, NeurIPS.
[10] Sergey Levine, et al. Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills, 2021, ICML.
[11] Stephen Roberts, et al. Augmented World Models Facilitate Zero-Shot Dynamics Generalization From a Single Offline Environment, 2021, ICML.
[12] Rob Fergus, et al. Decoupling Value and Policy for Generalization in Reinforcement Learning, 2021, ICML.
[13] Sergey Levine, et al. COMBO: Conservative Offline Model-Based Policy Optimization, 2021, NeurIPS.
[14] Rico Jonschkowski, et al. The Distracting Control Suite - A Challenging Benchmark for Reinforcement Learning from Pixels, 2021, ArXiv.
[15] Chelsea Finn, et al. Offline Reinforcement Learning from Images with Latent Space Models, 2020, L4DC.
[16] Mohammad Norouzi, et al. Mastering Atari with Discrete World Models, 2020, ICLR.
[17] Gabriel Dulac-Arnold, et al. Model-Based Offline Planning, 2020, ICLR.
[18] T. Taniguchi, et al. Dreaming: Model-based Reinforcement Learning by Latent Imagination without Reconstruction, 2020, 2021 IEEE International Conference on Robotics and Automation (ICRA).
[19] Matthias Bethge, et al. Improving robustness against common corruptions by covariate shift adaptation, 2020, NeurIPS.
[20] Yuval Tassa, et al. dm_control: Software and Tasks for Continuous Control, 2020, Softw. Impacts.
[21] Jaime Fernández del Río, et al. Array programming with NumPy, 2020, Nature.
[22] S. Levine, et al. Conservative Q-Learning for Offline Reinforcement Learning, 2020, NeurIPS.
[23] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[24] Lantao Yu, et al. MOPO: Model-based Offline Policy Optimization, 2020, NeurIPS.
[25] Pieter Abbeel, et al. Planning to Explore via Self-Supervised World Models, 2020, ICML.
[26] T. Joachims, et al. MOReL: Model-Based Offline Reinforcement Learning, 2020, NeurIPS.
[27] S. Levine, et al. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems, 2020, ArXiv.
[28] P. Abbeel, et al. Reinforcement Learning with Augmented Data, 2020, NeurIPS.
[29] R. Fergus, et al. Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels, 2020, ICLR.
[30] Justin Fu, et al. D4RL: Datasets for Deep Data-Driven Reinforcement Learning, 2020, ArXiv.
[31] Xingyou Song, et al. Observational Overfitting in Reinforcement Learning, 2019, ICLR.
[32] Jimmy Ba, et al. Dream to Control: Learning Behaviors by Latent Imagination, 2019, ICLR.
[33] Oleg O. Sushkov, et al. Scaling data-driven robotics with reward sketching and batch reinforcement learning, 2019, Robotics: Science and Systems.
[34] Yifan Wu, et al. Behavior Regularized Offline Reinforcement Learning, 2019, ArXiv.
[35] Rishabh Agarwal, et al. An Optimistic Perspective on Offline Reinforcement Learning, 2019, ICML.
[36] Sergey Levine, et al. Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction, 2019, NeurIPS.
[37] Sergey Levine, et al. Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables, 2019, ICML.
[38] Noah A. Smith, et al. To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks, 2019, RepL4NLP@ACL.
[39] Henry Zhu, et al. Soft Actor-Critic Algorithms and Applications, 2018, ArXiv.
[40] Doina Precup, et al. Off-Policy Deep Reinforcement Learning without Exploration, 2018, ICML.
[41] Yee Whye Teh, et al. Disentangling Disentanglement in Variational Autoencoders, 2018, ICML.
[42] Ruben Villegas, et al. Learning Latent Dynamics for Planning from Pixels, 2018, ICML.
[43] Yiting Xie, et al. Pre-training on Grayscale ImageNet Improves Medical Image Classification, 2018, ECCV Workshops.
[44] David Janz, et al. Learning to Drive in a Day, 2018, 2019 International Conference on Robotics and Automation (ICRA).
[45] Sergey Levine, et al. QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation, 2018, CoRL.
[46] Trevor Darrell, et al. BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning, 2018, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[47] Guillaume Desjardins, et al. Understanding disentangling in β-VAE, 2018, ArXiv.
[48] Herke van Hoof, et al. Addressing Function Approximation Error in Actor-Critic Methods, 2018, ICML.
[49] David Filliat, et al. State Representation Learning for Control: An Overview, 2018, Neural Networks.
[50] Finale Doshi-Velez, et al. Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes, 2017, AAAI.
[51] Paul Newman, et al. 1 year, 1000 km: The Oxford RobotCar dataset, 2017, Int. J. Robotics Res.
[52] Charles Blundell, et al. Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, 2016, NIPS.
[53] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[54] Yoshua Bengio, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, 2014, ArXiv.
[55] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[56] Pierre Geurts, et al. Tree-Based Batch Mode Reinforcement Learning, 2005, J. Mach. Learn. Res.
[57] Ivan Bratko, et al. Behavioural Cloning: Phenomena, Results and Problems, 1995.
[58] S. Levine, et al. Should I Run Offline Reinforcement Learning or Behavioral Cloning?, 2022, ICLR.
[59] Joelle Pineau, et al. Learning Robust State Abstractions for Hidden-Parameter Block MDPs, 2021, ICLR.
[60] Y. Gal, et al. VariBAD: Variational Bayes-Adaptive Deep RL via Meta-Learning, 2021, J. Mach. Learn. Res.
[61] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[62] Peter Stone, et al. Reinforcement learning, 2019, Scholarpedia.
[63] Claude Sammut, et al. A Framework for Behavioural Cloning, 1995, Machine Intelligence 15.
[64] A. Weigend, et al. Estimating the mean and variance of the target probability distribution, 1994, Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94).