[1] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[2] Benjamin Van Roy,et al. The Linear Programming Approach to Approximate Dynamic Programming , 2003, Oper. Res.
[3] A. Karimi,et al. Master's thesis , 2011 .
[4] Kazuo Tanaka,et al. An approach to fuzzy control of nonlinear systems: stability and design issues , 1996, IEEE Trans. Fuzzy Syst..
[5] Theo Gasser,et al. A Unifying Approach to Nonparametric Regression Estimation , 1988 .
[6] Matthew W. Hoffman,et al. Finite-Sample Analysis of Lasso-TD , 2011, ICML.
[7] Yichuan Zhang,et al. Advances in Neural Information Processing Systems 25 , 2012 .
[8] Marek Petrik,et al. Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes , 2010, ICML.
[9] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[10] Oliver Kroemer,et al. A Non-Parametric Approach to Dynamic Programming , 2011, NIPS.
[11] Warren B. Powell,et al. Convergence Analysis of Kernel-based On-policy Approximate Policy Iteration Algorithms for Markov Decision Processes with Continuous, Multidimensional States and Actions , 2010 .
[12] Luc Devroye,et al. The uniform convergence of nearest neighbor regression function estimators and their application in optimization , 1978, IEEE Trans. Inf. Theory.
[13] Liming Xiang,et al. Kernel-Based Reinforcement Learning , 2006, ICIC.
[14] Ronald Parr,et al. Linear Complementarity for Regularized Policy Evaluation and Improvement , 2010, NIPS.
[15] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[16] B. A. Pires,et al. Statistical analysis of L1-penalized linear estimation with applications , 2012 .
[17] Vivek F. Farias,et al. A Smoothed Approximate Linear Program , 2009, NIPS.
[18] Andrew Y. Ng,et al. Regularization and feature selection in least-squares temporal difference learning , 2009, ICML '09.
[19] P. Schweitzer,et al. Generalized polynomial approximations in Markovian decision processes , 1985 .
[20] L. Devroye. The uniform convergence of the Nadaraya-Watson regression function estimate , 1978 .