Near-Optimal Reinforcement Learning in Dynamic Treatment Regimes
[1] J M Robins,et al. Marginal Mean Models for Dynamic Regimes , 2001, Journal of the American Statistical Association.
[2] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[3] D. Rubin,et al. The central role of the propensity score in observational studies for causal effects , 1983 .
[4] Kamyar Azizzadenesheli,et al. Reinforcement Learning in Rich-Observation MDPs using Spectral Methods , 2016, 1611.03907.
[5] H. Sung,et al. Selecting Therapeutic Strategies Based on Efficacy and Death in Multicourse Clinical Trials , 2002 .
[6] H. Sung,et al. Evaluating multiple treatment courses in clinical trials. , 2000, Statistics in medicine.
[7] Lihong Li,et al. PAC model-free reinforcement learning , 2006, ICML.
[8] Bibhas Chakraborty,et al. Dynamic treatment regimes for managing chronic health conditions: a statistical perspective. , 2011, American journal of public health.
[9] S. Zionts,et al. Programming with linear fractional functionals , 1968 .
[10] Peter Auer,et al. The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput.
[11] Peter Auer,et al. Near-optimal Regret Bounds for Reinforcement Learning , 2008, J. Mach. Learn. Res.
[12] C. Manski. Nonparametric Bounds on Treatment Effects , 1989 .
[13] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables , 1963 .
[14] Elias Bareinboim,et al. Transfer Learning in Multi-Armed Bandit: A Causal Approach , 2017, AAMAS.
[15] James M. Robins,et al. Probabilistic evaluation of sequential plans from causal models with hidden variables , 1995, UAI.
[16] Brian T. Austin,et al. Improving chronic illness care: translating evidence into action. , 2001, Health affairs.
[17] Stephen L. George,et al. Granulocyte–Macrophage Colony-Stimulating Factor after Initial Chemotherapy for Elderly Patients with Primary Acute Myelogenous Leukemia , 1995 .
[18] Elias Bareinboim,et al. General Identifiability with Arbitrary Surrogate Experiments , 2019, UAI.
[19] Franz von Kutschera,et al. Causation , 1993, J. Philos. Log.
[20] Philip W. Lavori,et al. A design for testing clinical strategies: biased adaptive within‐subject randomization , 2000 .
[21] Donald B. Rubin,et al. Bayesian Inference for Causal Effects: The Role of Randomization , 1978 .
[22] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[23] S. Murphy,et al. An experimental design for the development of adaptive treatment strategies , 2005, Statistics in medicine.
[24] E. Ordentlich,et al. Inequalities for the L1 Deviation of the Empirical Distribution , 2003 .
[25] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[26] J. Pearl. Causality: Models, Reasoning and Inference , 2000 .
[27] Judea Pearl,et al. Counterfactuals and Policy Analysis in Structural Models , 1995, UAI.
[28] Ree Dawson,et al. Adaptive treatment strategies in chronic disease. , 2008, Annual review of medicine.
[29] Anastasios A. Tsiatis,et al. Semiparametric efficient estimation of survival distributions in two-stage randomisation designs in clinical trials with censored data , 2006 .
[30] S. Murphy,et al. Optimal dynamic treatment regimes , 2003 .
[31] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples , 1933 .
[32] Tom Burr,et al. Causation, Prediction, and Search , 2003, Technometrics.
[33] J. Robins,et al. Estimation and extrapolation of optimal treatment and testing strategies , 2008, Statistics in medicine.
[34] Benjamin Van Roy,et al. Near-optimal Reinforcement Learning in Factored MDPs , 2014, NIPS.
[35] Emma Brunskill,et al. A PAC RL Algorithm for Episodic POMDPs , 2016, AISTATS.
[36] Susan A. Murphy,et al. A Generalization Error for Q-Learning , 2005, J. Mach. Learn. Res.
[37] S. Murphy,et al. Dynamic Treatment Regimes. , 2014, Annual review of statistics and its application.
[38] A. Tsiatis,et al. Optimal Estimator for the Survival Distribution and Related Quantities for Treatment Policies in Two‐Stage Randomization Designs in Clinical Trials , 2004, Biometrics.
[39] D. Rubin,et al. Principal Stratification in Causal Inference , 2002, Biometrics.
[40] Tsuyoshi Murata,et al. {m , 1934, ACML.
[41] Csaba Szepesvári,et al. Model-based reinforcement learning with nearly tight exploration complexity bounds , 2010, ICML.
[42] Qiang Liu,et al. Belief Propagation for Structured Decision Making , 2012, UAI.
[43] B. Chakraborty,et al. Statistical Methods for Dynamic Treatment Regimes: Reinforcement Learning, Causal Inference, and Personalized Medicine , 2013 .
[44] Elias Bareinboim,et al. Causal inference and the data-fusion problem , 2016, Proceedings of the National Academy of Sciences.
[45] Marie Davidian,et al. Estimation of Survival Distributions of Treatment Policies in Two‐Stage Randomization Designs in Clinical Trials , 2002, Biometrics.