High Confidence Generalization for Reinforcement Learning
Philip S. Thomas | Georgios Theocharous | Scott M. Jordan | James E. Kostas | Yash Chandak
[1] Xingyou Song, et al. Observational Overfitting in Reinforcement Learning, 2019, ICLR.
[2] Philip S. Thomas, et al. Concentration Inequalities for Conditional Value at Risk, 2019, ICML.
[3] Peter Stone, et al. Transfer Learning for Reinforcement Learning Domains: A Survey, 2009, J. Mach. Learn. Res.
[4] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables, 1963.
[5] Xingyou Song, et al. The Principle of Unchanged Optimality in Reinforcement Learning Generalization, 2019, ArXiv.
[6] Honglak Lee, et al. Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning, 2017, ICML.
[7] Yuriy Brun, et al. Preventing undesirable behavior of intelligent machines, 2019, Science.
[8] Taehoon Kim, et al. Quantifying Generalization in Reinforcement Learning, 2018, ICML.
[9] Meysam Bastani, et al. Model-Free Intelligent Diabetes Management Using Machine Learning, 2014.
[10] David B. Brown, et al. Large deviations bounds for estimating conditional value-at-risk, 2007, Oper. Res. Lett.
[11] Michael L. Littman, et al. Measuring and Characterizing Generalization in Deep Reinforcement Learning, 2018, Applied AI Letters.
[12] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[13] Student, et al. The Probable Error of a Mean, 1908.
[14] Andrew G. Barto, et al. Autonomous shaping: knowledge transfer in reinforcement learning, 2006, ICML.
[15] Peter Stone, et al. Transfer Learning via Inter-Task Mappings for Temporal Difference Learning, 2007, J. Mach. Learn. Res.
[16] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[17] Finale Doshi-Velez, et al. Hidden Parameter Markov Decision Processes: A Semiparametric Regression Approach for Discovering Latent Task Parametrizations, 2013, IJCAI.
[18] Kathleen M. Jagodnik, et al. Reinforcement Learning and Feedback Control for High-Level Upper-Extremity Neuroprostheses, 2014.
[19] Joelle Pineau, et al. A Dissection of Overfitting and Generalization in Continuous Reinforcement Learning, 2018, ArXiv.
[20] Robert F. Kirsch, et al. Combined feedforward and feedback control of a redundant, nonlinear, dynamic musculoskeletal system, 2009, Medical & Biological Engineering & Computing.
[21] Finale Doshi-Velez, et al. Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes, 2017, AAAI.
[22] George Konidaris, et al. Value Function Approximation in Reinforcement Learning Using the Fourier Basis, 2011, AAAI.
[23] Sergey Levine, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.
[24] Romain Laroche, et al. Safe Policy Improvement with an Estimated Baseline Policy, 2020, AAMAS.
[25] Romain Laroche, et al. Safe Policy Improvement with Baseline Bootstrapping, 2017, ICML.
[26] Richard Socher, et al. On the Generalization Gap in Reparameterizable Reinforcement Learning, 2019, ICML.
[27] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.