An empirical study on prediction of population health through social media

Public health measurement is important for government administration because it provides indicators that inform public healthcare strategies. Health status has traditionally been measured through surveys, in which pre-designed questionnaires collect responses from targeted participants. Despite its benefits, this traditional approach is costly, time-consuming, and not scalable. These limitations pose a major obstacle for policy makers seeking to develop up-to-date healthcare programs. This paper studies the use of health-related information conveyed in user-generated content from social media to predict health outcomes at the population level. Specifically, we investigate linguistic features for analysing textual data, propose the use of visual features learnt from deep neural networks for understanding visual data, and introduce collective social capital information derived from location-based social media data. We conducted extensive experiments on large-scale datasets collected from two online social networks, Foursquare and Flickr, on the task of predicting U.S. county health indices. Experimental results showed that visual and collective social capital data achieved comparable prediction performance, and both outperformed textual information. These promising results suggest the potential of social media for health analysis at population scale.
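The population-level prediction setting described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the mean pooling of per-post features into county vectors, the ridge regressor, the synthetic data, and all function names (`pool_by_county`, `fit_ridge`, `run_synthetic_demo`) are assumptions chosen only to show the general shape of "aggregate user-generated content by county, then regress a county health index".

```python
import numpy as np


def pool_by_county(post_features, post_counties):
    """Mean-pool per-post feature vectors into one vector per county (assumed aggregation)."""
    counties = sorted(set(post_counties))
    index = {c: i for i, c in enumerate(counties)}
    sums = np.zeros((len(counties), post_features.shape[1]))
    counts = np.zeros(len(counties))
    for vec, county in zip(post_features, post_counties):
        sums[index[county]] += vec
        counts[index[county]] += 1
    return counties, sums / counts[:, None]


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalised intercept (stand-in regressor)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w


def run_synthetic_demo(seed=0, n_counties=40, posts_per_county=30, dim=5):
    """Train on some synthetic counties, return Pearson r on held-out counties."""
    rng = np.random.default_rng(seed)
    county_means = rng.normal(size=(n_counties, dim))   # hypothetical latent county signal
    true_w = rng.normal(size=dim)
    health_index = county_means @ true_w + 0.1 * rng.normal(size=n_counties)

    # Simulate noisy per-post features (text, visual, or social-capital vectors).
    feats, labels = [], []
    for c in range(n_counties):
        noise = 0.5 * rng.normal(size=(posts_per_county, dim))
        feats.append(county_means[c] + noise)
        labels.extend([c] * posts_per_county)
    counties, X = pool_by_county(np.vstack(feats), labels)
    y = health_index[np.array(counties)]

    n_train = int(0.75 * n_counties)
    w, b = fit_ridge(X[:n_train], y[:n_train])
    pred = X[n_train:] @ w + b
    return float(np.corrcoef(pred, y[n_train:])[0, 1])


if __name__ == "__main__":
    print("held-out Pearson r:", run_synthetic_demo())
```

In such a setup, each modality (linguistic, visual, social capital) would supply its own per-post feature matrix, and the modalities could be compared by their held-out correlation with the county index.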
