Combining Parametric and Nonparametric Models for Off-Policy Evaluation
Yao Liu | Finale Doshi-Velez | Emma Brunskill | Omer Gottesman | Scott Sussex
[1] Uri Shalit, et al. Learning Representations for Counterfactual Inference, 2016, ICML.
[2] Doina Precup, et al. Eligibility Traces for Off-Policy Policy Evaluation, 2000, ICML.
[3] Uri Shalit, et al. Estimating individual treatment effect: generalization bounds and algorithms, 2016, ICML.
[4] Johan Pallud, et al. A Tumor Growth Inhibition Model for Low-Grade Glioma Treated with Chemotherapy or Radiotherapy, 2012, Clinical Cancer Research.
[5] Rémi Coulom, et al. Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, 2006, Computers and Games.
[6] M. Ghavamzadeh, et al. Robust Policy Optimization with Baseline Guarantees, 2015, arXiv:1506.04514.
[7] Philip S. Thomas, et al. Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning, 2016, ICML.
[8] Srivatsan Srinivasan, et al. Evaluating Reinforcement Learning Algorithms in Observational Health Settings, 2018, ArXiv.
[9] Philip S. Thomas, et al. Safe Reinforcement Learning, 2015.
[10] Kavosh Asadi, et al. Lipschitz Continuity in Model-based Reinforcement Learning, 2018, ICML.
[11] Peter Stone, et al. Bootstrapping with Models: Confidence Intervals for Off-Policy Evaluation, 2016, AAAI.
[12] Louis Wehenkel, et al. Batch mode reinforcement learning based on the synthesis of artificial trajectories, 2013, Ann. Oper. Res.
[13] Yao Liu, et al. Representation Balancing MDPs for Off-Policy Policy Evaluation, 2018, NeurIPS.
[14] Andrew Slavin Ross, et al. Improving Sepsis Treatment Strategies by Combining Deep and Kernel-Based Reinforcement Learning, 2018, AMIA.
[15] Louis Wehenkel, et al. Model-Free Monte Carlo-like Policy Evaluation, 2010, AISTATS.
[16] Qiang Liu, et al. Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation, 2018, NeurIPS.
[17] Louis Wehenkel, et al. Clinical data based optimal STI strategies for HIV: a reinforcement learning approach, 2006, Proceedings of the 45th IEEE Conference on Decision and Control.
[18] B. P. Zhang, et al. Estimation of the Lipschitz constant of a function, 1996, J. Glob. Optim.
[19] Finale Doshi-Velez, et al. Combining Kernel and Model Based Learning for HIV Therapy Selection, 2017, CRI.
[20] Simon M. Lucas, et al. A Survey of Monte Carlo Tree Search Methods, 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[21] Andrew Slavin Ross, et al. Improving counterfactual reasoning with kernelised dynamic mixing models, 2018, PLoS ONE.
[22] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[23] Mitsuo Kawato, et al. Multiple Model-Based Reinforcement Learning, 2002, Neural Computation.
[24] Nan Jiang, et al. Doubly Robust Off-policy Value Evaluation for Reinforcement Learning, 2015, ICML.
[25] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1995, NIPS.
[26] Mehrdad Farajtabar, et al. More Robust Doubly Robust Off-policy Evaluation, 2018, ICML.