Automatically applying a credibility appraisal tool to track vaccination-related communications shared on social media

Background: Tools used to appraise the credibility of health information are time-consuming to apply and require context-specific expertise, limiting their use for quickly identifying and mitigating the spread of misinformation as it emerges. Our aim was to estimate the proportion of vaccination-related posts on Twitter that are likely to be misinformation, and how unevenly exposure to misinformation was distributed among Twitter users. Methods: Sampling from 144,878 vaccination-related web pages shared on Twitter between January 2017 and March 2018, we used a seven-point checklist adapted from two validated tools to appraise the credibility of a subset of 474 web pages. These appraisals were used to train several classifiers (random forest, support vector machines, and a recurrent neural network with transfer learning), using the text of a web page to predict whether it satisfies each of the seven criteria. Results: Applying the best performing classifier to the 144,878 web pages, we found that 14.4% of relevant posts linking to text-based communications pointed to web pages of low credibility, accounting for 9.2% of all potential vaccination-related exposures. However, the 100 most popular links to misinformation were potentially seen by between 2 million and 80 million Twitter users, and for a substantial sub-population of Twitter users engaging with vaccination-related information, links to misinformation appear to dominate the vaccination-related information to which they were exposed. Conclusions: We proposed a new method for automatically appraising the credibility of web pages based on a combination of validated checklist tools. The results suggest that an automatic credibility appraisal tool can be used to find populations at higher risk of exposure to misinformation, or applied proactively to add friction to the sharing of low-credibility vaccination information.
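The Methods describe the classification step only at a high level. As an illustration, a minimal sketch of one plausible baseline is given below: a per-criterion text classifier that predicts each of the seven checklist criteria from web-page text. The example labels, feature pipeline, and model settings are assumptions for illustration, not the authors' implementation (which also evaluated support vector machines and a recurrent neural network with transfer learning).

```python
# Illustrative sketch only: predicts each of seven credibility-checklist
# criteria from raw page text using TF-IDF features and one random forest
# per criterion. All data and hyperparameters below are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical labelled sample standing in for the 474 manually appraised
# pages: page text plus seven binary labels (1 = criterion satisfied).
pages = [
    "The MMR vaccine is safe and effective; see the cited randomised trials ...",
    "Doctors will not tell you the hidden truth about vaccine injuries ...",
]
labels = [
    [1, 1, 1, 1, 1, 1, 1],  # satisfies all seven checklist criteria
    [0, 0, 0, 1, 0, 0, 0],  # fails most criteria -> low credibility
]

# Shared TF-IDF features feeding an independent random forest per criterion.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1, max_features=50_000),
    MultiOutputClassifier(RandomForestClassifier(n_estimators=200, random_state=0)),
)
model.fit(pages, labels)

# Applying the fitted model to an unlabelled page yields seven criterion
# predictions; pages failing most criteria would be flagged as low
# credibility before counting potential exposures on Twitter.
print(model.predict(["New study questions vaccine schedule without citing evidence ..."]))
```

In practice the trained model would be run over all 144,878 web pages, and the per-page criterion counts thresholded to label each shared link as low or acceptable credibility.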
