Provable Benefit of Multitask Representation Learning in Reinforcement Learning
[1] S. Du, et al. Provable General Function Class Representation Learning in Multitask Bandits and MDPs, 2022, NeurIPS.
[2] Alekh Agarwal, et al. Provable Benefits of Representational Transfer in Reinforcement Learning, 2022, COLT.
[3] Yu-Xiang Wang, et al. Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism, 2022, ICLR.
[4] M. J. Azizi, et al. Non-stationary Bandits and Meta-Learning with a Small Set of Optimal Arms, 2022, ArXiv.
[5] Dylan R. Ashley, et al. All You Need Is Supervised Learning: From Imitation Learning to Meta-RL With Upside Down RL, 2022, ArXiv.
[6] M. Pontil, et al. Multi-task Representation Learning with Stochastic Linear Bandits, 2022, AISTATS.
[7] Alekh Agarwal, et al. Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach, 2022, ICML.
[8] Aldo Pacchiano, et al. Meta Learning MDPs with Linear Transition Models, 2022, AISTATS.
[9] Samet Oymak, et al. Non-Stationary Representation Learning in Sequential Linear Bandits, 2022, IEEE Open Journal of Control Systems.
[10] Yu-Xiang Wang, et al. Towards Instance-Optimal Offline Reinforcement Learning with Pessimism, 2021, NeurIPS.
[11] Wen Sun, et al. Representation Learning for Online and Offline RL in Low-rank MDPs, 2021, ICLR.
[12] Martin J. Wainwright, et al. Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning, 2021, NeurIPS.
[13] Simon S. Du, et al. On the Power of Multitask Representation Learning in Linear MDP, 2021, ArXiv.
[14] Alekh Agarwal, et al. Bellman-consistent Pessimism for Offline Reinforcement Learning, 2021, NeurIPS.
[15] Caiming Xiong, et al. Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning, 2021, NeurIPS.
[16] Yu-Xiang Wang, et al. Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings, 2021, NeurIPS.
[17] S. Du, et al. Nearly Horizon-Free Offline Reinforcement Learning, 2021, NeurIPS.
[18] Shachar Lovett, et al. Bilinear Classes: A Structural Framework for Provable Generalization in RL, 2021, ICML.
[19] A. Krishnamurthy, et al. Model-free Representation Learning and Exploration in Low-rank MDPs, 2021, ArXiv.
[20] Joelle Pineau, et al. Multi-Task Reinforcement Learning with Context-based Representations, 2021, ICML.
[21] Xiaoyu Chen, et al. Near-optimal Representation Learning for Linear Bandits and Linear RL, 2021, ICML.
[22] Chi Jin, et al. Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms, 2021, NeurIPS.
[23] Zhuoran Yang, et al. Is Pessimism Provably Efficient for Offline RL?, 2020, ICML.
[24] Andrea Zanette, et al. Exponential Lower Bounds for Batch Reinforcement Learning: Batch RL can be Exponentially Harder than Online RL, 2020, ICML.
[25] Ruosong Wang, et al. What are the Statistical Limits of Offline RL with Linear Function Approximation?, 2020, ICLR.
[26] Jason D. Lee, et al. Impact of Representation Learning in Linear Bandits, 2020, ICLR.
[27] Mykel J. Kochenderfer, et al. Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration, 2020, NeurIPS.
[28] Gergely Neu, et al. A Unifying View of Optimism in Episodic Reinforcement Learning, 2020, NeurIPS.
[29] S. Kakade, et al. FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs, 2020, NeurIPS.
[30] Andrea Bonarini, et al. Sharing Knowledge in Multi-Task Deep Reinforcement Learning, 2020, ICLR.
[31] Mykel J. Kochenderfer, et al. Learning Near Optimal Policies with Low Inherent Bellman Error, 2020, ICML.
[32] Michael I. Jordan, et al. Provable Meta-Learning of Linear Representations, 2020, ICML.
[33] Sanjeev Arora, et al. Provable Representation Learning for Imitation Learning via Bi-level Optimization, 2020, ICML.
[34] Sham M. Kakade, et al. Few-Shot Learning via Learning the Representation, Provably, 2020, ICLR.
[35] Weihao Kong, et al. Meta-learning for Mixed Linear Regression, 2020, ICML.
[36] Chi Jin, et al. Provably Efficient Exploration in Policy Optimization, 2019, ICML.
[37] Ruosong Wang, et al. Optimism in Reinforcement Learning with Generalized Linear Function Approximation, 2019, ICLR.
[38] Akshay Krishnamurthy, et al. Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning, 2019, ICML.
[39] Michael I. Jordan, et al. Provably Efficient Reinforcement Learning with Linear Function Approximation, 2019, COLT.
[40] Lin F. Yang, et al. Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound, 2019, ICML.
[41] Nan Jiang, et al. Provably Efficient RL with Rich Observations via Latent State Decoding, 2019, ICML.
[42] J. Langford, et al. Model-based RL in Contextual Decision Processes: PAC Bounds and Exponential Improvements over Model-free Approaches, 2018, COLT.
[43] Michael I. Jordan, et al. Is Q-learning Provably Efficient?, 2018, NeurIPS.
[44] Xian Wu, et al. Variance Reduced Value Iteration and Faster Algorithms for Solving Markov Decision Processes, 2017, SODA.
[45] Yee Whye Teh, et al. Distral: Robust Multitask Reinforcement Learning, 2017, NIPS.
[46] Ürün Dogan, et al. Multi-Task Learning for Contextual Bandits, 2017, NIPS.
[47] Rémi Munos, et al. Minimax Regret Bounds for Reinforcement Learning, 2017, ICML.
[48] Tor Lattimore, et al. Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning, 2017, NIPS.
[49] Nan Jiang, et al. Contextual Decision Processes with Low Bellman Rank are PAC-Learnable, 2016, ICML.
[50] Massimiliano Pontil, et al. The Benefit of Multitask Representation Learning, 2015, J. Mach. Learn. Res..
[51] Daniele Calandriello, et al. Sparse Multi-task Reinforcement Learning, 2014, Intelligenza Artificiale.
[52] Benjamin Van Roy, et al. Generalization and Exploration via Randomized Value Functions, 2014, ICML.
[53] Lihong Li, et al. Sample Complexity of Multi-task Reinforcement Learning, 2013, UAI.
[54] Tor Lattimore, et al. PAC Bounds for Discounted MDPs, 2012, ALT.
[55] H. Kappen, et al. Speedy Q-Learning, 2011, NIPS.
[56] Peter Auer, et al. Near-optimal Regret Bounds for Reinforcement Learning, 2008, J. Mach. Learn. Res..
[57] Lihong Li, et al. PAC Model-free Reinforcement Learning, 2006, ICML.
[58] Sven Koenig, et al. Complexity Analysis of Real-Time Reinforcement Learning, 1992, AAAI.
[59] Massimiliano Pontil, et al. Multi-task and Meta-learning with Sparse Linear Bandits, 2021, UAI.