Deep Bayesian Quadrature Policy Optimization
[1] A. Rollett, et al. The Monte Carlo Method, 2004.
[2] Andrew Gordon Wilson, et al. Deep Kernel Learning, 2015, AISTATS.
[3] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[4] John Langford, et al. Approximately Optimal Approximate Reinforcement Learning, 2002, ICML.
[5] Gang Niu, et al. Analysis and Improvement of Policy Gradient Estimation, 2011, NIPS.
[6] Shie Mannor, et al. Reinforcement learning with Gaussian processes, 2005, ICML.
[7] Larry Rudolph, et al. Are Deep Policy Gradient Algorithms Truly Policy Gradient Algorithms?, 2018, arXiv.
[8] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[9] Kamyar Azizzadenesheli, et al. Efficient Exploration Through Bayesian Deep Q-Networks, 2018, Information Theory and Applications Workshop (ITA).
[10] J. Baxter, et al. Direct gradient-based reinforcement learning, 2000, IEEE International Symposium on Circuits and Systems (ISCAS).
[11] S. Kakade, et al. Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes, 2019, COLT.
[12] Wojciech Zaremba, et al. OpenAI Gym, 2016, arXiv.
[13] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[14] Michal Valko, et al. Bayesian Policy Gradient and Actor-Critic Algorithms, 2016, J. Mach. Learn. Res.
[15] Stefan Schaal, et al. Reinforcement learning of motor skills with policy gradients, 2008, Neural Networks.
[16] Sham M. Kakade, et al. A Natural Policy Gradient, 2001, NIPS.
[17] Andrew Gordon Wilson, et al. Kernel Interpolation for Scalable Structured Gaussian Processes (KISS-GP), 2015, ICML.
[18] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[19] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[20] Nathan Halko, et al. Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, 2009, SIAM Rev.
[21] Mohammad Ghavamzadeh, et al. Bayesian actor-critic algorithms, 2007, ICML.
[22] Pieter Abbeel, et al. Benchmarking Deep Reinforcement Learning for Continuous Control, 2016, ICML.
[23] Michael A. Osborne, et al. Probabilistic numerics and uncertainty in computations, 2015, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences.
[24] Shie Mannor, et al. Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning, 2003, ICML.
[25] Christopher K. I. Williams, et al. Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning), 2005.
[26] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[27] Richard E. Turner. Statistical models for natural sounds, 2010.
[28] Yuesheng Xu, et al. Universal Kernels, 2006, J. Mach. Learn. Res.
[29] Mohammad Ghavamzadeh, et al. Bayesian Policy Gradient Algorithms, 2006, NIPS.
[30] Michael A. Osborne, et al. Frank-Wolfe Bayesian Quadrature: Probabilistic Integration with Theoretical Guarantees, 2015, NIPS.
[31] Kenji Fukumizu, et al. Convergence Analysis of Deterministic Kernel-Based Quadrature Rules in Misspecified Settings, 2017, Foundations of Computational Mathematics.
[32] Andrew Gordon Wilson, et al. Fast Kernel Learning for Multidimensional Pattern Extrapolation, 2014, NIPS.
[33] Sham M. Kakade, et al. On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift, 2019, J. Mach. Learn. Res.
[34] A. O'Hagan, et al. Bayes–Hermite quadrature, 1991.
[35] Risto Miikkulainen, et al. Online kernel selection for Bayesian reinforcement learning, 2008, ICML.
[36] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[37] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[38] Andrew Gordon Wilson, et al. GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration, 2018, NeurIPS.
[39] Francis R. Bach, et al. On the Equivalence between Kernel Quadrature Rules and Random Feature Expansions, 2015, J. Mach. Learn. Res.
[40] Kenji Fukumizu, et al. Convergence guarantees for kernel-based quadrature rules in misspecified settings, 2016, NIPS.
[41] B. Silverman, et al. Some Aspects of the Spline Smoothing Approach to Non-Parametric Regression Curve Fitting, 1985.