Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation
[1] J. Hammersley,et al. Monte Carlo Methods , 1965 .
[2] J. Halton. A Retrospective and Prospective Survey of the Monte Carlo Method , 1970 .
[3] Alan Weiss,et al. Sensitivity analysis via likelihood ratios , 1986, WSC '86.
[4] Peter W. Glynn,et al. Likelihood ratio gradient estimation: an overview , 1987, WSC '87.
[5] J. Halton. Sequential Monte Carlo techniques for the solution of linear systems , 1994 .
[6] Alexander J. Smola,et al. Support Vector Method for Function Approximation, Regression Estimation and Signal Processing , 1996, NIPS.
[7] Vladimir Vapnik,et al. Statistical learning theory , 1998 .
[8] Dennis D. Cox,et al. Adaptive importance sampling on discrete Markov chains , 1999 .
[9] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[10] Vivek S. Borkar,et al. Actor-Critic - Type Learning Algorithms for Markov Decision Processes , 1999, SIAM J. Control. Optim..
[11] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[12] John N. Tsitsiklis,et al. Approximate Gradient Methods in Policy-Space Optimization of Markov Reward Processes , 2003, Discret. Event Dyn. Syst..
[13] S. Maire. An iterative computation of approximations on Korobov-like spaces , 2003 .
[14] Eric R. Ziegel,et al. The Elements of Statistical Learning , 2003, Technometrics.
[15] Peter L. Bartlett,et al. Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning , 2001, J. Mach. Learn. Res..
[16] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[17] Andrew W. Moore,et al. Locally Weighted Learning , 1997, Artificial Intelligence Review.
[18] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[19] Sylvain Maire,et al. Sequential Control Variates for Functionals of Markov Processes , 2005, SIAM J. Numer. Anal..
[20] P. S. Dwyer. Annals of Applied Probability , 2006 .
[21] P. Glynn. Likelihood ratio gradient estimation: an overview , 2022 .