Deep Radial-Basis Value Functions for Continuous Control
Kavosh Asadi | Michael L. Littman | Neev Parikh | Ronald E. Parr | George D. Konidaris
[1] R. Bellman,et al. On the Theory of Dynamic Programming , 1952, Proceedings of the National Academy of Sciences of the United States of America.
[2] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[3] M. J. D. Powell,et al. Radial basis functions for multivariable interpolation: a review , 1987 .
[4] D. Broomhead,et al. Radial Basis Functions, Multi-Variable Functional Interpolation and Adaptive Networks , 1988 .
[5] John Moody,et al. Fast Learning in Networks of Locally-Tuned Processing Units , 1989, Neural Computation.
[6] Kurt Hornik,et al. Multilayer feedforward networks are universal approximators , 1989, Neural Networks.
[7] Franz Aurenhammer,et al. Voronoi diagrams—a survey of a fundamental geometric data structure , 1991, CSUR.
[8] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems , 1994 .
[9] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[10] Michel Benaïm,et al. On Functional Approximation with Normalized Gaussian Units , 1994, Neural Comput..
[11] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[12] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding , 1996 .
[13] Guido Bugmann,et al. Normalized Gaussian Radial Basis Function networks , 1998, Neurocomputing.
[14] Nicolaos B. Karayiannis,et al. Reformulated radial basis neural networks trained by gradient descent , 1999, IEEE Trans. Neural Networks.
[15] Corinna Cortes,et al. Support-Vector Networks , 1995, Machine Learning.
[16] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[17] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[18] Eli Upfal,et al. Multi-Armed Bandits in Metric Spaces , 2008 .
[19] Jason Pazis,et al. Generalized Value Functions for Large Action Sets , 2011, ICML.
[20] George Konidaris,et al. Value Function Approximation in Reinforcement Learning Using the Fourier Basis , 2011, AAAI.
[21] Alborz Geramifard,et al. A Tutorial on Linear Function Approximators for Dynamic Programming and Reinforcement Learning , 2013, Found. Trends Mach. Learn..
[22] Guy Lever,et al. Deterministic Policy Gradient Algorithms , 2014, ICML.
[23] Marlos C. Machado,et al. Domain-Independent Optimistic Initialization for Reinforcement Learning , 2014, AAAI Workshop: Learning for General Competency in Video Games.
[24] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[25] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[26] Benjamin Van Roy,et al. Deep Exploration via Bootstrapped DQN , 2016, NIPS.
[27] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[28] Sergey Levine,et al. Continuous Deep Q-Learning with Model-based Acceleration , 2016, ICML.
[29] Kavosh Asadi,et al. An Alternative Softmax Operator for Reinforcement Learning , 2016, ICML.
[30] Lei Xu,et al. Input Convex Neural Networks: Supplementary Material , 2017 .
[31] Martha White,et al. Actor-Expert: A Framework for using Action-Value Methods in Continuous Action Spaces , 2018, ArXiv.
[32] Tom Schaul,et al. Rainbow: Combining Improvements in Deep Reinforcement Learning , 2017, AAAI.
[33] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[34] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[35] Philip S. Thomas,et al. Learning Action Representations for Reinforcement Learning , 2019, ICML.
[36] Lawrence Carin,et al. Revisiting the Softmax Bellman Operator: New Benefits and New Perspective , 2018, ICML.
[37] Sean R. Sinclair,et al. Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces , 2019, Proc. ACM Meas. Anal. Comput. Syst..
[38] Michal Valko,et al. Regret Bounds for Kernel-Based Reinforcement Learning , 2020, ArXiv.
[39] Craig Boutilier,et al. CAQL: Continuous Action Q-Learning , 2019, ICLR.
[40] Marc G. Bellemare,et al. Zooming for Efficient Model-Free Reinforcement Learning in Metric Spaces , 2020, ArXiv.