An Analysis of Categorical Distributional Reinforcement Learning
Yee Whye Teh | Marc G. Bellemare | Rémi Munos | Mark Rowland | Will Dabney
[1] John N. Tsitsiklis, et al. Analysis of temporal-difference learning with function approximation, 1996, NIPS.
[2] Sean R. Eddy, et al. What is dynamic programming?, 2004, Nature Biotechnology.
[3] Marc G. Bellemare, et al. The Cramer Distance as a Solution to Biased Wasserstein Gradients, 2017, ArXiv.
[4] Moshe Shaked, et al. Stochastic orders and their applications, 1994.
[5] Marc G. Bellemare, et al. A Distributional Perspective on Reinforcement Learning, 2017, ICML.
[6] John N. Tsitsiklis, et al. Asynchronous Stochastic Approximation and Q-Learning, 1994, Machine Learning.
[7] H. Kushner, et al. Stochastic Approximation and Recursive Algorithms and Applications, 2003.
[8] Shie Mannor, et al. Learning the Variance of the Reward-To-Go, 2016, J. Mach. Learn. Res.
[9] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[10] Masashi Sugiyama, et al. Nonparametric Return Distribution Approximation for Reinforcement Learning, 2010, ICML.
[11] J. Norris. Appendix: probability and measure, 1997.
[12] Patrick Billingsley, et al. Probability and Measure, 1986.
[13] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[14] Marc G. Bellemare, et al. Distributional Reinforcement Learning with Quantile Regression, 2017, AAAI.
[15] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[16] Mohammad Ghavamzadeh, et al. Actor-Critic Algorithms for Risk-Sensitive MDPs, 2013, NIPS.
[17] Michael I. Jordan, et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES, 1996.
[18] Mahesan Niranjan, et al. On-line Q-learning using connectionist systems, 1994.
[19] Masashi Sugiyama, et al. Parametric Return Density Estimation for Reinforcement Learning, 2010, UAI.