Non-Parametric Approximate Linear Programming for MDPs