Learning When-to-Treat Policies
Xinkun Nie | Emma Brunskill | Stefan Wager