Exploring Clustering Techniques for Effective Reinforcement Learning based Personalization for Health and Wellbeing

Personalisation has become omnipresent in society. In the domain of health and wellbeing, such personalisation can contribute to better interventions and improved health states of users. To be effective in this domain, personalisation needs to be performed quickly and with minimal impact on the users. Reinforcement learning is one of the techniques that can be used to establish such personalisation, but it is not known to be very fast at learning. Cluster-based reinforcement learning has been proposed to improve the learning speed: users who show similar behaviour are clustered, and one policy is learned for each cluster. An important factor in this effort is the clustering method, which can influence the benefit of such an approach. In this paper, we propose three distance metrics based on the state of the users (Euclidean distance, Dynamic Time Warping, and high-level features) and apply different clustering techniques to these distance metrics to study their impact on overall performance. We evaluate the different methods in a simulator with users spawned from very distinct user profiles as well as from overlapping user profiles. The results show that clustering configurations using high-level features significantly outperform regular reinforcement learning without clustering (which learns either one policy for all users or one policy per individual).
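
To make the role of the distance metric concrete, the following is a minimal Python sketch, not the implementation used in the paper. It assumes each user is summarised by a one-dimensional numeric trace of their state over time. The function names, the choice of mean/min/max as "high-level features", the k-medoids clusterer, and its farthest-point initialisation are all illustrative assumptions. In a cluster-based reinforcement learning setup, the experience of all users sharing a label would then be pooled to learn one policy per cluster.

```python
import math
from typing import Callable, List, Sequence, Tuple

Series = Sequence[float]


def euclidean(a: Series, b: Series) -> float:
    """Point-by-point distance; requires equal-length traces."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs equal-length series")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a: Series, b: Series) -> float:
    """Classic dynamic-programming DTW with absolute difference as local cost."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def feature_distance(a: Series, b: Series) -> float:
    """Euclidean distance between summary features.

    Assumption: mean, min and max stand in for the paper's high-level
    features, which are not specified in the abstract.
    """
    def features(s: Series) -> Tuple[float, float, float]:
        return (sum(s) / len(s), min(s), max(s))

    return euclidean(features(a), features(b))


def k_medoids(series: List[Series], k: int,
              dist: Callable[[Series, Series], float],
              max_iter: int = 100) -> Tuple[List[int], List[int]]:
    """Simple k-medoids on a precomputed distance matrix.

    Returns (labels, medoid_indices). Initialisation is a deterministic
    farthest-point heuristic (an assumption made for reproducibility).
    """
    n = len(series)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of series")
    d = [[dist(series[i], series[j]) for j in range(n)] for i in range(n)]

    medoids = [0]
    while len(medoids) < k:
        # Next medoid: the point farthest from its nearest existing medoid.
        medoids.append(max(range(n), key=lambda i: min(d[i][m] for m in medoids)))

    labels = [0] * n
    for _ in range(max_iter):
        # Assignment step: each user joins the cluster of the nearest medoid.
        labels = [min(range(k), key=lambda c: d[i][medoids[c]]) for i in range(n)]
        # Update step: the new medoid minimises total distance within its cluster.
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                updated.append(medoids[c])  # keep the medoid of an empty cluster
                continue
            updated.append(min(members, key=lambda i: sum(d[i][j] for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return labels, medoids


if __name__ == "__main__":
    # Hypothetical traces: two users with a time-shifted peak, two with a flat profile.
    users = [
        [0, 0, 1, 5, 1, 0],
        [0, 1, 5, 1, 0, 0],
        [5, 5, 5, 5, 5, 5],
        [5, 5, 4, 5, 5, 5],
    ]
    for name, metric in [("euclidean", euclidean), ("dtw", dtw), ("features", feature_distance)]:
        labels, medoids = k_medoids(users, 2, metric)
        print(name, labels, medoids)
```

In this toy data the first two traces differ only by a time shift, so their DTW distance is zero while their Euclidean distance is not. This illustrates why the choice of metric can change which users end up sharing a policy.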
