Automatically Appraising the Credibility of Vaccine-Related Web Pages Shared on Social Media: A Twitter Surveillance Study

Background: Tools used to appraise the credibility of health information are time-consuming to apply and require context-specific expertise, limiting their use for quickly identifying and mitigating the spread of misinformation as it emerges.

Objective: The aim of this study was to estimate the proportion of vaccine-related Twitter posts linked to Web pages of low credibility and to measure the potential reach of those posts.

Methods: Sampling from 143,003 unique vaccine-related Web pages shared on Twitter between January 2017 and March 2018, we used a 7-point checklist adapted from validated tools and guidelines to manually appraise the credibility of 474 Web pages. These were used to train several classifiers (random forests, support vector machines, and recurrent neural networks) that use the text of a Web page to predict whether the information satisfies each of the 7 criteria. After estimating the credibility of all other Web pages, we used the follower network to estimate potential exposures relative to a credibility score defined by the 7-point checklist.

Results: The best-performing classifiers were able to distinguish between low, medium, and high credibility with an accuracy of 78% and labeled low-credibility Web pages with a precision of over 96%. Across the set of unique Web pages, 11.86% (16,961 of 143,003) were estimated as low credibility, and they generated 9.34% (1.64 billion of 17.6 billion) of potential exposures. The 100 most popular links to low-credibility Web pages were each potentially seen by an estimated 2 million to 80 million Twitter users globally.

Conclusions: The results indicate that although a small minority of low-credibility Web pages reach a large audience, low-credibility Web pages tend to reach fewer users than other Web pages overall and are more commonly shared within certain subpopulations. An automatic credibility appraisal tool may be useful for finding communities of users at higher risk of exposure to low-credibility vaccine communications.
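
To make the scoring and exposure steps in the Methods concrete, the following is a minimal sketch and not the authors' implementation. It assumes the credibility score is the count of satisfied checklist criteria (0 to 7), that the low/medium/high cut-offs are 0-2, 3-4, and 5-7, and that a page's potential exposures are the summed follower counts of the accounts that shared it. None of these details are stated in the abstract; the names `Page`, `credibility_band`, and `summarize` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Sequence

N_CRITERIA = 7  # size of the checklist described in the abstract


@dataclass
class Page:
    url: str
    criteria: Sequence[bool]         # predicted outcome for each of the 7 criteria
    poster_followers: Sequence[int]  # follower count of each account that shared the page


def credibility_score(criteria: Sequence[bool]) -> int:
    """Count the satisfied criteria (assumed scoring rule)."""
    if len(criteria) != N_CRITERIA:
        raise ValueError(f"expected {N_CRITERIA} criteria, got {len(criteria)}")
    return sum(bool(c) for c in criteria)


def credibility_band(score: int, low_max: int = 2, medium_max: int = 4) -> str:
    """Map a score to a band; the cut-offs are assumptions, not from the abstract."""
    if score <= low_max:
        return "low"
    if score <= medium_max:
        return "medium"
    return "high"


def potential_exposures(page: Page) -> int:
    """Sum follower counts; this double-counts users who follow several posters."""
    return sum(page.poster_followers)


def summarize(pages: Sequence[Page]) -> Dict[str, Dict[str, float]]:
    """Return each band's share of pages and share of potential exposures."""
    totals = {band: {"pages": 0, "exposures": 0} for band in ("low", "medium", "high")}
    for page in pages:
        band = credibility_band(credibility_score(page.criteria))
        totals[band]["pages"] += 1
        totals[band]["exposures"] += potential_exposures(page)
    n_pages = len(pages)
    n_exposures = sum(t["exposures"] for t in totals.values())
    return {
        band: {
            "page_share": t["pages"] / n_pages if n_pages else 0.0,
            "exposure_share": t["exposures"] / n_exposures if n_exposures else 0.0,
        }
        for band, t in totals.items()
    }


# Toy data, invented purely for illustration.
toy_pages = [
    Page("a", [True] + [False] * 6, [100, 50]),
    Page("b", [True] * 4 + [False] * 3, [300]),
    Page("c", [True] * 7, [550]),
]
toy_summary = summarize(toy_pages)
```

The same two ratios, computed over the full data set, correspond to the figures reported in the Results: a page share of 16,961 / 143,003 for low-credibility pages and an exposure share of 1.64 billion / 17.6 billion.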
