Policy Gradient in Continuous Time
[1] Michael A. Arbib,et al. Topics in Mathematical System Theory , 1969 .
[2] Alan Weiss,et al. Sensitivity analysis via likelihood ratios , 1986, WSC '86.
[3] Peter W. Glynn,et al. Likelihood ratio gradient estimation: an overview , 1987, WSC '87.
[4] H. Kushner,et al. A Monte Carlo method for sensitivity analysis and parametric optimization of nonlinear stochastic systems , 1991 .
[5] Gene H. Golub,et al. Matrix computations (3rd ed.) , 1996 .
[6] O. Nelles,et al. An Introduction to Optimization , 1996, IEEE Antennas and Propagation Magazine.
[7] M. Talagrand. A new look at independence , 1996 .
[8] Harold J. Kushner,et al. Stochastic Approximation Algorithms and Applications , 1997, Applications of Mathematics.
[9] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[10] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[11] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[12] M. Ledoux. The concentration of measure phenomenon , 2001 .
[13] John N. Tsitsiklis,et al. Approximate Gradient Methods in Policy-Space Optimization of Markov Reward Processes , 2003, Discret. Event Dyn. Syst..
[14] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[15] Rémi Munos,et al. Sensitivity Analysis Using Itô-Malliavin Calculus and Martingales, and Application to Stochastic Optimal Control , 2005, SIAM J. Control. Optim..
[16] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[17] Steven M. LaValle,et al. Planning algorithms , 2006 .