A Cautious Approach to Generalization in Reinforcement Learning
[1] S. Murphy, et al. An experimental design for the development of adaptive treatment strategies, 2005, Statistics in medicine.
[2] J. Ingersoll. Theory of Financial Decision Making, 1987.
[3] Shie Mannor, et al. Percentile Optimization for Markov Decision Processes with Parameter Uncertainty, 2010, Oper. Res.
[4] Louis Wehenkel, et al. Reinforcement Learning Versus Model Predictive Control: A Comparison on a Power System Problem, 2009, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[5] Andrew W. Moore, et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function, 1994, NIPS.
[6] Richard S. Sutton, et al. Dimensions of Reinforcement Learning, 1998.
[7] D. Ernst. Selecting concise sets of samples for a reinforcement learning agent, 2005.
[8] Geoffrey J. Gordon, et al. Approximate solutions to Markov decision processes, 1999.
[9] László Monostori, et al. Value Function Based Reinforcement Learning in Changing Markovian Environments, 2008, J. Mach. Learn. Res.
[10] Pierre Geurts, et al. Tree-Based Batch Mode Reinforcement Learning, 2005, J. Mach. Learn. Res.
[11] Andrew G. Barto, et al. Reinforcement learning, 1998.
[12] Alberto Bemporad, et al. Robust model predictive control: A survey, 1998, Robustness in Identification and Control.
[13] John N. Tsitsiklis, et al. Bias and variance in value function estimation, 2004, ICML.
[14] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[15] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse, 1996.
[16] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method, 2005, ECML.
[17] S. Murphy, et al. Optimal dynamic treatment regimes, 2003.
[18] Louis Wehenkel, et al. Inferring bounds on the performance of a control policy from a sample of trajectories, 2009, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning.
[19] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[20] S. Murphy, et al. Performance Guarantees for Individualized Treatment Rules, 2011, Annals of statistics.