Radu State | Gaston Ormazabal | Jérémy Charlier | Jean Hilger
[1] R. C. Merton, et al. Theory of Rational Option Pricing, 2015, World Scientific Reference on Contingent Claims Analysis in Corporate Finance.
[2] J. Hull. Options, Futures, and Other Derivatives, 1989.
[3] Yann LeCun, et al. Pedestrian Detection with Unsupervised Multi-stage Feature Learning, 2012, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[4] Chris Watkins, et al. Learning from delayed rewards, 1989.
[5] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[6] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[7] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[8] F. Black, et al. The Pricing of Options and Corporate Liabilities, 1973, Journal of Political Economy.
[9] Paul Wilmott, et al. Paul Wilmott on Quantitative Finance, 2010.
[10] Yuval Tassa, et al. Learning Continuous Control Policies by Stochastic Value Gradients, 2015, NIPS.
[11] Marc G. Bellemare, et al. A Distributional Perspective on Reinforcement Learning, 2017, ICML.
[12] Susan A. Murphy, et al. A Generalization Error for Q-Learning, 2005, J. Mach. Learn. Res.
[13] John N. Tsitsiklis, et al. Analysis of temporal-difference learning with function approximation, 1996, NIPS 1996.
[14] Oldrich A. Vasicek. An equilibrium characterization of the term structure, 1977.
[15] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[16] Leemon C. Baird, et al. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.
[17] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[18] Martin A. Riedmiller, et al. Deep auto-encoder neural networks in reinforcement learning, 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).
[19] Igor Halperin. QLBS: Q-Learner in the Black-Scholes (-Merton) Worlds, 2017, ArXiv.
[20] Rémi Munos, et al. Implicit Quantile Networks for Distributional Reinforcement Learning, 2018, ICML.
[21] Ajay Kumar Tanwani, et al. Autonomous reinforcement learning with experience replay, 2013, Neural networks: the official journal of the International Neural Network Society.
[22] Muhammad Ghifary, et al. Compatible Value Gradients for Reinforcement Learning of Continuous Deep Policies, 2015, ArXiv.
[23] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[24] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[25] H. Robbins. A Stochastic Approximation Method, 1951.
[26] Hado van Hasselt, et al. Double Q-learning, 2010, NIPS.
[27] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[28] Richard S. Sutton, et al. Temporal credit assignment in reinforcement learning, 1984.
[29] Peter Dayan, et al. Q-learning, 1992, Machine Learning.