A scoping review of the use of Twitter for public health research

Abstract: Public health practitioners and researchers have long relied on traditional medical databases to study and understand public health. Recently, social media data, particularly from Twitter, has begun to be used for public health purposes. Every major technological development in history has affected the behaviour of society, and the advent of the internet and social media is no different. Social media creates public streams of communication, and scientists are starting to understand that such data can provide some level of access to people's opinions and situations. This paper therefore reviews and synthesizes the literature on Twitter applications for public health, highlighting current research and products in practice. A scoping review methodology was employed, and four leading health, computer science and cross-disciplinary databases were searched. A total of 755 articles were retrieved, 92 of which met the criteria for review. From the reviewed literature, six domains for the application of Twitter to public health were identified: (i) Surveillance; (ii) Event Detection; (iii) Pharmacovigilance; (iv) Forecasting; (v) Disease Tracking; and (vi) Geographic Identification. The review gives a clear picture of how Twitter is used for public health, including how the popularity of the different domains changed over time, which diseases and conditions were studied and how each was approached, and which algorithms and techniques were popular within each domain.
