Model-Free and Model-Based Policy Evaluation when Causality is Uncertain