Peng Liao | Kristjan H. Greenewald | Predrag Klasnja | Susan A. Murphy