Uncertainty propagation for quality assurance in Reinforcement Learning
[1] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[2] Rémi Munos, et al. Error Bounds for Approximate Policy Iteration, 2003, ICML.
[3] David J. C. MacKay, et al. Information Theory, Inference, and Learning Algorithms, 2004, IEEE Transactions on Information Theory.
[4] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[5] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[6] David Andre, et al. Model based Bayesian Exploration, 1999, UAI.
[7] U. Rieder, et al. Markov Decision Processes, 2010.
[8] Csaba Szepesvári, et al. Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path, 2006, COLT.
[9] Thomas Martinetz, et al. Improving Optimality of Neural Rewards Regression for Data-Efficient Batch Near-Optimal Policy Identification, 2007, ICANN.
[10] John Hallam, et al. IEEE International Joint Conference on Neural Networks, 2005.
[11] Shie Mannor, et al. Reinforcement learning with Gaussian processes, 2005, ICML.
[12] Shie Mannor, et al. Percentile optimization in uncertain Markov decision processes with application to efficient exploration, 2007, ICML '07.
[13] Giulio D'Agostini, et al. Bayesian Reasoning in Data Analysis: A Critical Introduction, 2003.
[14] Shie Mannor, et al. Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning, 2003, ICML.
[15] Steffen Udluft, et al. A Neural Reinforcement Learning Approach to Gas Turbine Control, 2007, 2007 International Joint Conference on Neural Networks.
[16] Don Coppersmith, et al. Matrix multiplication via arithmetic progressions, 1987, STOC.
[17] Carl E. Rasmussen, et al. Gaussian Processes in Reinforcement Learning, 2003, NIPS.
[18] Wilfried Brauer, et al. Fuzzy Model-Based Reinforcement Learning, 2002, Advances in Computational Intelligence and Learning.
[19] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[20] Jesse Hoey, et al. An analytic solution to discrete Bayesian reinforcement learning, 2006, ICML.
[21] Yishay Mansour, et al. Approximate Planning in Large POMDPs via Reusable Trajectories, 1999, NIPS.
[22] George Woodworth, et al. Bayesian Reasoning in Data Analysis: A Critical Introduction, 2004.
[23] E. Iso, et al. Measurement Uncertainty and Probability: Guide to the Expression of Uncertainty in Measurement, 1995.
[24] Leonid Peshkin, et al. Bounds on Sample Size for Policy Evaluation in Markov Environments, 2001, COLT/EuroCOLT.
[25] Mohammad Ghavamzadeh, et al. Bayesian actor-critic algorithms, 2007, ICML '07.
[26] Mohammad Ghavamzadeh, et al. Bayesian Policy Gradient Algorithms, 2006, NIPS.