Collapsing Bandits and Their Application to Public Health Interventions

We propose and study Collapsing Bandits, a new restless multi-armed bandit (RMAB) setting in which each arm follows a binary-state Markovian process with a special structure: when an arm is played, its state is fully observed, thus "collapsing" any uncertainty, but when an arm is passive, no observation is made, thus allowing uncertainty to evolve. The goal is to keep as many arms in the "good" state as possible by planning a limited budget of actions per round. Such Collapsing Bandits are natural models for many healthcare domains in which workers must simultaneously monitor patients and deliver interventions in a way that maximizes the health of their patient cohort. Our main contributions are as follows: (i) Building on the Whittle index technique for RMABs, we derive conditions under which the Collapsing Bandits problem is indexable. Our derivation hinges on novel conditions that characterize when the optimal policies may take the form of either "forward" or "reverse" threshold policies. (ii) We exploit the optimality of threshold policies to build fast algorithms for computing the Whittle index, including a closed-form expression. (iii) We evaluate our algorithm on several data distributions, including data from a real-world healthcare task in which a worker must monitor and deliver interventions to maximize their patients' adherence to tuberculosis medication. Our algorithm achieves a speedup of three orders of magnitude over state-of-the-art RMAB techniques while achieving similar performance.
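The "collapsing" structure described above can be illustrated with a minimal Python sketch of how the belief about a single arm might evolve. It is an illustration under stated assumptions, not the paper's algorithm: the function names and the transition probabilities `p_gg` (good to good) and `p_bg` (bad to good) are hypothetical, and a single transition matrix is assumed for all rounds.

```python
def evolve(belief, p_gg, p_bg):
    """One unobserved (passive) round: probability the arm is 'good' next round.

    belief: current probability that the arm is in the good state.
    p_gg, p_bg: assumed probabilities of moving to 'good' from 'good' / 'bad'.
    """
    return belief * p_gg + (1.0 - belief) * p_bg


def collapse(observed_good):
    """Playing the arm reveals its state, so the belief becomes 0 or 1."""
    return 1.0 if observed_good else 0.0


def passive_beliefs(observed_good, p_gg, p_bg, rounds):
    """Beliefs over `rounds` consecutive passive rounds after one observation."""
    belief = collapse(observed_good)
    trajectory = []
    for _ in range(rounds):
        belief = evolve(belief, p_gg, p_bg)
        trajectory.append(belief)
    return trajectory


if __name__ == "__main__":
    # Hypothetical numbers chosen only for illustration.
    print(passive_beliefs(True, p_gg=0.9, p_bg=0.2, rounds=5))
    print(passive_beliefs(False, p_gg=0.9, p_bg=0.2, rounds=5))
```

Under these assumptions, the belief of a passive arm depends only on the last observed state and the number of rounds since that observation, which is why each observation resets the uncertainty.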
