State Relevance for Off-Policy Evaluation
Simon P. Shen | Yecheng Ma | Omer Gottesman | Finale Doshi-Velez
[1] Yifei Ma et al. Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling, 2019, NeurIPS.
[2] B. L. Welch. On the Comparison of Several Mean Values: An Alternative Approach, 1951.
[3] Marcello Restelli et al. Optimistic Policy Optimization via Multiple Importance Sampling, 2019, ICML.
[4] Qiang Liu et al. Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation, 2018, NeurIPS.
[5] Richard S. Sutton et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[6] Yi Su et al. Doubly Robust Off-Policy Evaluation with Shrinkage, 2019, ICML.
[7] J. M. Robins et al. Marginal Mean Models for Dynamic Regimes, 2001, Journal of the American Statistical Association.
[8] Philip S. Thomas et al. High Confidence Policy Improvement, 2015, ICML.
[9] E. Ionides. Truncated Importance Sampling, 2008.
[10] Philip S. Thomas et al. High-Confidence Off-Policy Evaluation, 2015, AAAI.
[11] Doina Precup et al. Eligibility Traces for Off-Policy Policy Evaluation, 2000, ICML.
[12] Mehrdad Farajtabar et al. More Robust Doubly Robust Off-Policy Evaluation, 2018, ICML.
[13] Alex Graves et al. Playing Atari with Deep Reinforcement Learning, 2013, arXiv.
[14] T. Schaul et al. Conditional Importance Sampling for Off-Policy Learning, 2019, AISTATS.
[15] Philip S. Thomas et al. Importance Sampling for Fair Policy Selection, 2017, UAI.
[16] Sergey Levine et al. Offline Policy Evaluation Across Representations with Applications to Educational Games, 2014, AAMAS.
[17] Philip S. Thomas et al. Safe Reinforcement Learning, 2015.
[18] Philip S. Thomas et al. Using Options and Covariance Testing for Long Horizon Off-Policy Policy Evaluation, 2017, NIPS.
[19] Nan Jiang et al. Doubly Robust Off-Policy Value Evaluation for Reinforcement Learning, 2015, ICML.
[20] J. L. Hodges et al. The Significance Probability of the Smirnov Two-Sample Test, 1958.
[21] Richard S. Sutton et al. Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[22] Philip S. Thomas et al. Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning, 2016, ICML.
[23] Yao Liu et al. Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions, 2020, ICML.
[24] Marcello Restelli et al. Importance Sampling Techniques for Policy Optimization, 2020, J. Mach. Learn. Res.