Distributional Reinforcement Learning with Monotonic Splines
[1] Fan Zhou,et al. Non-decreasing Quantile Function Network with Efficient Exploration for Distributional Reinforcement Learning , 2021, IJCAI.
[2] A. Aldo Faisal,et al. Bayesian Distributional Policy Gradients , 2021, AAAI.
[3] Svetha Venkatesh,et al. Distributional Reinforcement Learning via Moment Matching , 2020, AAAI.
[4] Dmitry Vetrov,et al. Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics , 2020, ICML.
[5] C. Leckie,et al. Invertible Generative Modeling using Linear Rational Splines , 2020, AISTATS.
[6] Yongxin Chen,et al. Sample-based Distributional Policy Gradient , 2020, L4DC.
[7] Tie-Yan Liu,et al. Fully Parameterized Quantile Function for Distributional Reinforcement Learning , 2019, NeurIPS.
[8] Iain Murray,et al. Neural Spline Flows , 2019, NeurIPS.
[9] Shie Mannor,et al. Distributional Policy Optimization: An Alternative Approach for Continuous Control , 2019, NeurIPS.
[10] John D. Martin,et al. Stochastically Dominant Distributional Reinforcement Learning , 2019, ICML.
[11] Bo Liu,et al. QUOTA: The Quantile Option Architecture for Reinforcement Learning , 2018, AAAI.
[12] Alan Fern,et al. Learning Finite State Representations of Recurrent Policy Networks , 2018, ICLR.
[13] Aviv Tamar,et al. Distributional Multivariate Policy Evaluation and Exploration with the Bellman GAN , 2018, ArXiv.
[14] Rémi Munos,et al. Implicit Quantile Networks for Distributional Reinforcement Learning , 2018, ICML.
[15] Yee Whye Teh,et al. An Analysis of Categorical Distributional Reinforcement Learning , 2018, AISTATS.
[16] Matthew W. Hoffman,et al. Distributed Distributional Deterministic Policy Gradients , 2018, ICLR.
[17] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[18] Marc G. Bellemare,et al. Distributional Reinforcement Learning with Quantile Regression , 2017, AAAI.
[19] Tom Schaul,et al. Rainbow: Combining Improvements in Deep Reinforcement Learning , 2017, AAAI.
[20] Marlos C. Machado,et al. Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents , 2017, J. Artif. Intell. Res..
[21] Marc G. Bellemare,et al. A Distributional Perspective on Reinforcement Learning , 2017, ICML.
[22] Marc G. Bellemare,et al. The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning , 2017, ICLR.
[23] J. Schulman,et al. OpenAI Gym , 2016, ArXiv.
[24] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[25] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[26] Bernhard Schölkopf,et al. A Kernel Two-Sample Test , 2012, J. Mach. Learn. Res..
[27] Masashi Sugiyama,et al. Parametric Return Density Estimation for Reinforcement Learning , 2010, UAI.
[28] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[29] J. Gregory,et al. Piecewise rational quadratic interpolation to monotonic data , 1982 .
[30] Xingdong Feng,et al. Non-Crossing Quantile Regression for Distributional Reinforcement Learning , 2020, NeurIPS.
[31] P. Schrimpf,et al. Dynamic Programming , 2011 .
[32] Razvan V. Florian,et al. Correct equations for the dynamics of the cart-pole system , 2005 .
[33] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[34] Frederick R. Forst,et al. On robust estimation of the location parameter , 1980 .
[35] R. Mazo. On the theory of Brownian motion , 1973 .
[36] D. Allen,et al. Quantile Regression , 2022 .