Provably Efficient Iterated CVaR Reinforcement Learning with Function Approximation
Yihan Du | Yu Chen | Pihe Hu | Longbo Huang | Si-Yi Wang | De-hui Wu
[1] Quanquan Gu,et al. Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency , 2023, COLT.
[2] Wen Sun,et al. Near-Minimax-Optimal Risk-Sensitive Reinforcement Learning with CVaR , 2023, ICML.
[3] Xuefeng Gao,et al. Regret Bounds for Markov Decision Processes with Recursive Optimized Certainty Equivalents , 2023, ICML.
[4] Quanquan Gu,et al. Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes , 2022, ICML.
[5] Alekh Agarwal,et al. VOQL: Towards Optimal Regret in Model-free RL with Nonlinear Function Approximation , 2022, ArXiv.
[6] Wei Xu,et al. Regret Bounds for Risk-Sensitive Reinforcement Learning , 2022, NeurIPS.
[7] Yihan Du,et al. Provably Efficient Risk-Sensitive Reinforcement Learning: Iterated CVaR and Worst Path , 2022, ICLR.
[8] Lei Xu,et al. DeepTrader: A Deep Reinforcement Learning Approach for Risk-Return Balanced Portfolio Management with Market Conditions Embedding , 2021, AAAI.
[9] Dylan J. Foster,et al. Understanding the Eluder Dimension , 2021, NeurIPS.
[10] Zihan Zhang,et al. Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP , 2021, NeurIPS.
[11] Michael I. Jordan,et al. Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints , 2021, NeurIPS.
[12] Quanquan Gu,et al. Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes , 2020, COLT.
[13] Giuseppe De Pietro,et al. Reinforcement learning for intelligent healthcare applications: A survey , 2020, Artif. Intell. Medicine.
[14] Csaba Szepesvari,et al. Bandit Algorithms , 2020 .
[15] Quanquan Gu,et al. Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping , 2020, ICML.
[16] Mengdi Wang,et al. Model-Based Reinforcement Learning with Value-Targeted Regression , 2020, L4DC.
[17] Lin F. Yang,et al. Reinforcement Learning with General Value Function Approximation: Provably Efficient Approach via Bounded Eluder Dimension , 2020, NeurIPS.
[18] Maria Grazia Speranza,et al. Conditional value-at-risk beyond finance: a survey , 2020, Int. Trans. Oper. Res..
[19] Jingliang Duan,et al. Safe Reinforcement Learning for Autonomous Vehicles through Parallel Constrained Policy Optimization , 2020, 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC).
[20] Ambuj Tewari,et al. Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles , 2019, AISTATS.
[21] Silvestr Stanko,et al. Risk-averse Distributional Reinforcement Learning: A CVaR Optimization Approach , 2019, IJCCI.
[22] Insoon Yang,et al. Risk-Aware Motion Planning and Control Using CVaR-Constrained Optimization , 2019, IEEE Robotics and Automation Letters.
[23] Mengdi Wang,et al. Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound , 2019, ICML.
[24] David Isele,et al. Safe Reinforcement Learning on Autonomous Vehicles , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[25] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[26] Etienne Perot,et al. Deep Reinforcement Learning framework for Autonomous Driving , 2017, Autonomous Vehicles and Machines.
[27] Huan Xu,et al. Approximate Value Iteration for Risk-Aware Markov Decision Processes , 2017, IEEE Transactions on Automatic Control.
[28] Shie Mannor,et al. Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach , 2015, NIPS.
[29] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[30] Benjamin Van Roy,et al. Model-based Reinforcement Learning and the Eluder Dimension , 2014, NIPS.
[31] Yi Zhang,et al. Markov decision processes with iterated coherent risk measures , 2014, Int. J. Control.
[32] Benjamin Van Roy,et al. Learning to Optimize via Posterior Sampling , 2013, Math. Oper. Res..
[33] Csaba Szepesvári,et al. Improved Algorithms for Linear Stochastic Bandits , 2011, NIPS.
[34] Jerzy A. Filar,et al. Time Consistent Dynamic Risk Measures , 2006, Math. Methods Oper. Res..
[35] K. H. Low,et al. Risk-Aware Reinforcement Learning with Coherent Risk Measures and Non-linear Function Approximation , 2023, ICLR.
[36] Zhuoran Yang,et al. Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach , 2021, ICML.
[37] Jonathan Theodor Ott,et al. A Markov Decision Model for a Surveillance Application and Risk-Sensitive Markov Decision Processes , 2010 .
[38] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[39] R. Rockafellar,et al. Optimization of conditional value-at-risk , 2000 .
[40] J. Hull. Options, Futures, and Other Derivatives , 1989 .