Revealing the Global Linguistic and Geographical Disparities of Public Awareness to Covid-19 Outbreak through Social Media

Covid-19 has presented an unprecedented challenge to public health worldwide. However, residents of different countries showed differing levels of Covid-19 awareness during the outbreak and suffered uneven health impacts. This study analyzed global Twitter data from January 1 to June 30, 2020, to answer two research questions: What linguistic and geographical disparities in public awareness during the Covid-19 outbreak are reflected on social media? Can changes in pandemic awareness predict the Covid-19 outbreak? We established a Twitter data mining framework that calculates a Ratio index to quantify and track awareness. Lag correlations between awareness and health impacts were examined at the global and country levels. Results show that the users with the highest Covid-19 awareness were mainly those tweeting in the official languages of India and Bangladesh. Asian countries showed more significant disparities in awareness than European countries, and awareness was higher in eastern Europe than in central Europe. Finally, the Ratio index could accurately predict the global mortality rate, the global case fatality ratio, and the country-level mortality rate, with lead times of 21-30, 35-42, and 17 days, respectively. This study yields timely insights into the use of social media for understanding human behaviors in public health research.
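
The lag-correlation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define how the Ratio index is computed or which correlation statistic was used, so the code below assumes a daily awareness series is already available and uses a plain Pearson correlation. The function names `pearson`, `lag_correlations`, and `best_lead` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumed statistic)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def lag_correlations(awareness, outcome, max_lag):
    """Correlate awareness[t] with outcome[t + lag] for lag = 0..max_lag days."""
    if len(awareness) != len(outcome):
        raise ValueError("series must cover the same days")
    result = {}
    for lag in range(max_lag + 1):
        x = awareness[: len(awareness) - lag]  # earlier awareness values
        y = outcome[lag:]                      # later health-impact values
        result[lag] = pearson(x, y)
    return result


def best_lead(awareness, outcome, max_lag):
    """Return the lag (in days) with the highest correlation."""
    corr = lag_correlations(awareness, outcome, max_lag)
    return max(corr, key=corr.get)


# Toy data (fabricated for illustration only): the outcome series is the
# awareness series delayed by three days.
awareness = [0, 1, 4, 9, 3, 2, 7, 5, 1, 0, 0, 0]
outcome = [0, 0, 0] + awareness[:-3]
print(best_lead(awareness, outcome, max_lag=5))
```

Under these assumptions, the lag with the strongest correlation is read as the number of days by which awareness leads the health outcome, which is how lead times such as those quoted in the abstract could be interpreted.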
