Behavior Prior Representation learning for Offline Reinforcement Learning
[1] Michael A. Osborne, et al. Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations, 2022, ArXiv.
[2] Alekh Agarwal, et al. Provable Benefits of Representational Transfer in Reinforcement Learning, 2022, COLT.
[3] Stuart J. Russell, et al. An Empirical Investigation of Representation Learning for Imitation, 2022, NeurIPS Datasets and Benchmarks.
[4] Mark Rowland, et al. Understanding and Preventing Capacity Loss in Reinforcement Learning, 2022, ICLR.
[5] Marlos C. Machado, et al. Investigating the Properties of Neural Network Representations in Reinforcement Learning, 2022, ArXiv.
[6] Adam M. Oberman, et al. On the Generalization of Representations in Reinforcement Learning, 2022, AISTATS.
[7] Xin Li, et al. SimSR: Simple Distance-based State Representation for Deep Reinforcement Learning, 2021, AAAI.
[8] Sergey Levine, et al. DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization, 2021, ICLR.
[9] Dylan J. Foster, et al. Offline Reinforcement Learning: Fundamental Barriers for Value Function Approximation, 2021, COLT.
[10] Hyun Oh Song, et al. Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble, 2021, NeurIPS.
[11] Viktor K. Prasanna, et al. BRAC+: Improved Behavior Regularized Actor Critic for Offline Reinforcement Learning, 2021, ACML.
[12] Marc G. Bellemare, et al. Deep Reinforcement Learning at the Edge of the Statistical Precipice, 2021, NeurIPS.
[13] Csaba Szepesvari, et al. The Curse of Passive Data Collection in Batch Reinforcement Learning, 2021, AISTATS.
[14] Pieter Abbeel, et al. Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL, 2021, ArXiv.
[15] Scott Fujimoto, et al. A Minimalist Approach to Offline Reinforcement Learning, 2021, NeurIPS.
[16] Prakash Panangaden, et al. MICo: Improved representations via sampling-based state similarity for Markov decision processes, 2021, NeurIPS.
[17] Romain Laroche, et al. Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs, 2021, NeurIPS.
[18] Ofir Nachum, et al. Provable Representation Learning for Imitation with Contrastive Fourier Features, 2021, NeurIPS.
[19] Stuart J. Russell, et al. Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism, 2021, IEEE Transactions on Information Theory.
[20] Ilya Kostrikov, et al. Offline Reinforcement Learning with Fisher Divergence Critic Regularization, 2021, ICML.
[21] Pieter Abbeel, et al. Behavior From the Void: Unsupervised Active Pre-Training, 2021, NeurIPS.
[22] Alessandro Lazaric, et al. Reinforcement Learning with Prototypical Representations, 2021, ICML.
[23] Sergey Levine, et al. COMBO: Conservative Offline Model-Based Policy Optimization, 2021, NeurIPS.
[24] Akshay Krishnamurthy, et al. Model-free Representation Learning and Exploration in Low-rank MDPs, 2021, ArXiv.
[25] Ofir Nachum, et al. Representation Matters: Offline Pretraining for Sequential Decision Making, 2021, ICML.
[26] Zhuoran Yang, et al. Is Pessimism Provably Efficient for Offline RL?, 2020, ICML.
[27] Sergey Levine, et al. Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning, 2020, ICLR.
[28] Pieter Abbeel, et al. Decoupling Representation Learning from Reinforcement Learning, 2020, ICML.
[29] Sergey Levine, et al. Learning Invariant Representations for Reinforcement Learning without Reconstruction, 2020, ICLR.
[30] Pierre H. Richemond, et al. Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning, 2020, NeurIPS.
[31] R. Devon Hjelm, et al. Deep Reinforcement and InfoMax Learning, 2020, NeurIPS.
[32] Sergey Levine, et al. Conservative Q-Learning for Offline Reinforcement Learning, 2020, NeurIPS.
[33] Lantao Yu, et al. MOPO: Model-based Offline Policy Optimization, 2020, NeurIPS.
[34] Phillip Isola, et al. Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere, 2020, ICML.
[35] Sergey Levine, et al. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems, 2020, ArXiv.
[36] Justin Fu, et al. D4RL: Datasets for Deep Data-Driven Reinforcement Learning, 2020, ArXiv.
[37] Pieter Abbeel, et al. CURL: Contrastive Unsupervised Representations for Reinforcement Learning, 2020, ICML.
[38] Sanjeev Arora, et al. Provable Representation Learning for Imitation Learning via Bi-level Optimization, 2020, ICML.
[39] Kavosh Asadi, et al. Learning State Abstractions for Transfer in Continuous Control, 2020, ArXiv.
[40] Akshay Krishnamurthy, et al. Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning, 2019, ICML.
[41] Vinicius G. Goecks, et al. Integrating Behavior Cloning and Reinforcement Learning for Improved Performance in Sparse Reward Environments, 2019, AAMAS.
[42] Joelle Pineau, et al. Improving Sample Efficiency in Model-Free Reinforcement Learning from Images, 2019, AAAI.
[43] Yifan Wu, et al. Behavior Regularized Offline Reinforcement Learning, 2019, ArXiv.
[44] Romain Laroche, et al. Safe Policy Improvement with Soft Baseline Bootstrapping, 2019, ECML/PKDD.
[45] Sergey Levine, et al. Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model, 2019, NeurIPS.
[46] Alexander Carballo, et al. A Survey of Autonomous Driving: Common Practices and Emerging Technologies, 2019, IEEE Access.
[47] Sergey Levine, et al. Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction, 2019, NeurIPS.
[48] Fredrik D. Johansson, et al. Guidelines for reinforcement learning in healthcare, 2019, Nature Medicine.
[49] Doina Precup, et al. Off-Policy Deep Reinforcement Learning without Exploration, 2018, ICML.
[50] Ruben Villegas, et al. Learning Latent Dynamics for Planning from Pixels, 2018, ICML.
[51] Oriol Vinyals, et al. Representation Learning with Contrastive Predictive Coding, 2018, ArXiv.
[52] Lu Wang, et al. Supervised Reinforcement Learning with Recurrent Neural Network for Dynamic Treatment Recommendation, 2018, KDD.
[53] Sergey Levine, et al. Diversity is All You Need: Learning Skills without a Reward Function, 2018, ICLR.
[54] Romain Laroche, et al. Safe Policy Improvement with Baseline Bootstrapping, 2017, ICML.
[55] Marcin Andrychowicz, et al. Overcoming Exploration in Reinforcement Learning with Demonstrations, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[56] Sergey Levine, et al. Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations, 2017, Robotics: Science and Systems.
[57] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[58] Trevor Darrell, et al. Loss is its own Reward: Self-Supervision for Reinforcement Learning, 2016, ICLR.
[59] Marek Petrik, et al. Safe Policy Improvement by Minimizing Robust Baseline Regret, 2016, NIPS.
[60] Balaraman Ravindran, et al. Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, 2016, ArXiv.
[61] Xin Zhang, et al. End to End Learning for Self-Driving Cars, 2016, ArXiv.
[62] Witold Pedrycz, et al. A Clustering-Based Graph Laplacian Framework for Value Function Approximation in Reinforcement Learning, 2014, IEEE Transactions on Cybernetics.
[63] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[64] Doina Precup, et al. On-the-Fly Algorithms for Bisimulation Metrics, 2012, 2012 Ninth International Conference on Quantitative Evaluation of Systems.
[65] David P. Woodruff, et al. Fast approximation of matrix coherence and statistical leverage, 2011, ICML.
[66] Ameet Talwalkar, et al. Can matrix coherence be efficiently and accurately estimated?, 2011, AISTATS.
[67] Geoffrey E. Hinton, et al. Rectified Linear Units Improve Restricted Boltzmann Machines, 2010, ICML.
[68] Emmanuel J. Candès, et al. Exact Matrix Completion via Convex Optimization, 2008, Found. Comput. Math.
[69] Doina Precup, et al. Methods for Computing State Similarity in Markov Decision Processes, 2006, UAI.
[70] S. Muthukrishnan, et al. Sampling algorithms for l2 regression and applications, 2006, SODA '06.
[71] Peter Stone, et al. State Abstraction Discovery from Irrelevant State Variables, 2005, IJCAI.
[72] Shie Mannor, et al. Dynamic abstraction in reinforcement learning via clustering, 2004, ICML.
[73] David Andre, et al. State abstraction for programmable reinforcement learning agents, 2002, AAAI/IAAI.
[74] Dean Pomerleau, et al. Efficient Training of Artificial Neural Networks for Autonomous Navigation, 1991, Neural Computation.
[75] Sepp Hochreiter, et al. Understanding the Effects of Dataset Characteristics on Offline Reinforcement Learning, 2021, ArXiv.
[76] Thomas J. Walsh, et al. Towards a Unified Theory of State Abstraction for MDPs, 2006, AI&M.