TOWARDS MINIMAX OPTIMAL REWARD-FREE REINFORCEMENT LEARNING
[1] Yu Chen, et al. Nearly Minimax Optimal Reinforcement Learning with Linear Function Approximation, 2022, ICML.
[2] A. Krishnamurthy, et al. On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL, 2022, NeurIPS.
[3] Ric De Santi, et al. Provably Efficient Causal Model-Based Reinforcement Learning for Systematic Generalization, 2022, AAAI.
[4] Tie-Yan Liu, et al. Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality, 2022, ICLR.
[5] Alekh Agarwal, et al. Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach, 2022, ICML.
[6] Kevin G. Jamieson, et al. First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach, 2021, ICML.
[7] Pieter Abbeel, et al. Mastering Atari Games with Limited Data, 2021, NeurIPS.
[8] Quanquan Gu, et al. Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation, 2021, NeurIPS.
[9] Liwei Wang, et al. Near-Optimal Reward-Free Exploration for Linear Mixture MDPs with Plug-in Solver, 2021, ICLR.
[10] V. Braverman, et al. Gap-Dependent Unsupervised Exploration for Reinforcement Learning, 2021, AISTATS.
[11] A. Krishnamurthy, et al. Model-free Representation Learning and Exploration in Low-rank MDPs, 2021, arXiv.
[12] Michael I. Jordan, et al. Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints, 2021, NeurIPS.
[13] Quanquan Gu, et al. Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes, 2020, COLT.
[14] Quanquan Gu, et al. Logarithmic Regret for Reinforcement Learning with Linear Function Approximation, 2020, ICML.
[15] Xiangyang Ji, et al. Nearly Minimax Optimal Reward-free Reinforcement Learning, 2020, arXiv.
[16] Mykel J. Kochenderfer, et al. Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration, 2020, NeurIPS.
[17] Anders Jonsson, et al. Fast Active Learning for Pure Exploration in Reinforcement Learning, 2020, ICML.
[18] Aaron C. Courville, et al. Data-Efficient Reinforcement Learning with Self-Predictive Representations, 2020, ICLR.
[19] Ruosong Wang, et al. On Reward-Free Reinforcement Learning with Linear Function Approximation, 2020, NeurIPS.
[20] E. Kaufmann, et al. Adaptive Reward-Free Exploration, 2020, ALT.
[21] Mengdi Wang, et al. Model-Based Reinforcement Learning with Value-Targeted Regression, 2020, L4DC.
[22] Lin F. Yang, et al. Reinforcement Learning with General Value Function Approximation: Provably Efficient Approach via Bounded Eluder Dimension, 2020, NeurIPS.
[23] Yi Wu, et al. Multi-Task Reinforcement Learning with Soft Modularization, 2020, NeurIPS.
[24] Mykel J. Kochenderfer, et al. Learning Near Optimal Policies with Low Inherent Bellman Error, 2020, ICML.
[25] Akshay Krishnamurthy, et al. Reward-Free Exploration for Reinforcement Learning, 2020, ICML.
[26] Chi Jin, et al. Provably Efficient Exploration in Policy Optimization, 2019, ICML.
[27] Ruosong Wang, et al. Optimism in Reinforcement Learning with Generalized Linear Function Approximation, 2019, ICLR.
[28] Akshay Krishnamurthy, et al. Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning, 2019, ICML.
[29] Alessandro Lazaric, et al. Frequentist Regret Bounds for Randomized Least-Squares Value Iteration, 2019, AISTATS.
[30] Ambuj Tewari, et al. Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles, 2019, AISTATS.
[31] Chelsea Finn, et al. Language as an Abstraction for Hierarchical Deep Reinforcement Learning, 2019, NeurIPS.
[32] Mengdi Wang, et al. Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound, 2019, ICML.
[33] Mengdi Wang, et al. Sample-Optimal Parametric Q-Learning Using Linearly Additive Features, 2019, ICML.
[34] Nan Jiang, et al. Provably Efficient RL with Rich Observations via Latent State Decoding, 2019, ICML.
[35] Nan Jiang, et al. Model-based RL in Contextual Decision Processes: PAC bounds and Exponential Improvements over Model-free Approaches, 2018, COLT.
[36] Lihong Li, et al. Policy Certificates: Towards Accountable Reinforcement Learning, 2018, ICML.
[37] Wojciech Czarnecki, et al. Multi-task Deep Reinforcement Learning with PopArt, 2018, AAAI.
[38] Michael I. Jordan, et al. Is Q-learning Provably Efficient?, 2018, NeurIPS.
[39] Nan Jiang, et al. On Oracle-Efficient PAC RL with Rich Observations, 2018, NeurIPS.
[40] Martin A. Riedmiller, et al. Learning by Playing - Solving Sparse Reward Tasks from Scratch, 2018, ICML.
[41] Marcin Andrychowicz, et al. Overcoming Exploration in Reinforcement Learning with Demonstrations, 2017, ICRA.
[42] Rémi Munos, et al. Minimax Regret Bounds for Reinforcement Learning, 2017, ICML.
[43] Nan Jiang, et al. Contextual Decision Processes with Low Bellman Rank are PAC-Learnable, 2016, ICML.
[44] Tor Lattimore, et al. PAC Bounds for Discounted MDPs, 2012, ALT.
[45] Csaba Szepesvári, et al. Improved Algorithms for Linear Stochastic Bandits, 2011, NIPS.