#Healthy Selfies: Exploration of Health Topics on Instagram

BACKGROUND Social media provides a complementary source of information for public health surveillance. The dominant data source for this type of monitoring is the microblogging platform Twitter, which is convenient due to the free availability of public data. Less is known about the utility of other social media platforms, despite their popularity.

OBJECTIVE This work aims to characterize the health topics that are prominently discussed on the image-sharing platform Instagram, as a step toward understanding how these data might be used for public health research.

METHODS The study uses a topic modeling approach to discover topics in a dataset of 96,426 Instagram posts containing hashtags related to health. We use a polylingual topic model, initially developed for datasets in different natural languages, to model different modalities of data: hashtags, caption words, and image tags automatically extracted using a computer vision tool.

RESULTS We identified 47 health-related topics in the data (kappa=.77), covering ten broad categories: acute illness, alternative medicine, chronic illness and pain, diet, exercise, health care & medicine, mental health, musculoskeletal health and dermatology, sleep, and substance use. The most prevalent topics were related to diet (8,293/96,426; 8.6% of posts) and exercise (7,328/96,426; 7.6% of posts).

CONCLUSIONS A large and diverse set of health topics is discussed on Instagram. The extracted image tags were generally too coarse and noisy to be used for identifying posts but were in some cases accurate for identifying images relevant to studying diet and substance use. Instagram shows potential as a source of public health information, though limitations in data collection and metadata availability may restrict its use in comparison to platforms like Twitter.
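As a rough illustration of the multimodal setup described in METHODS, the sketch below splits one post into three token streams (hashtags, caption words, image tags) that a polylingual topic model could treat as aligned "languages" sharing a single topic distribution. The function name, regular expressions, and example post are hypothetical and are not taken from the study; the image tags are assumed to come from some external computer vision tool.

```python
import re

# Hypothetical patterns; the study's actual preprocessing is not specified here.
HASHTAG_RE = re.compile(r"#(\w+)")
WORD_RE = re.compile(r"[a-z']+")


def split_modalities(caption, image_tags):
    """Return one token list per modality for a single post."""
    # Modality 1: hashtags, lowercased, without the leading '#'.
    hashtags = [h.lower() for h in HASHTAG_RE.findall(caption)]
    # Modality 2: caption words, after removing hashtags from the text.
    text = HASHTAG_RE.sub(" ", caption).lower()
    words = WORD_RE.findall(text)
    # Modality 3: image tags, assumed to be supplied by a vision tool.
    tags = [t.lower() for t in image_tags]
    return {"hashtags": hashtags, "caption": words, "image_tags": tags}


# Made-up example post for illustration only.
post = split_modalities("Morning run done! #fitness #Health", ["Person", "outdoor"])
```

Each post then becomes a tuple of three "documents", one per modality, which is the input shape a polylingual topic model expects in place of translations of the same text.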
