Drink bleach or do what now? Covid-HeRA: A dataset for risk-informed health decision making in the presence of COVID19 misinformation

Given the wide spread of inaccurate medical advice related to the 2019 coronavirus pandemic (COVID-19), such as fake remedies, treatments, and prevention suggestions, misinformation detection has emerged as an open problem of high importance and interest for the NLP community. To combat the potential harm of COVID-19-related misinformation, we release Covid-HeRA, a dataset for health risk assessment of COVID-19-related social media posts. More specifically, we study the severity of each misinformation story, i.e., how harmful a message can be if the audience believes it, and what types of signals can be used to discover highly malicious fake news and detect refuted claims. We present a detailed analysis, evaluate several simple and advanced classification models, and conclude with an experimental analysis that highlights open challenges and future directions.
