Public discourse and sentiment during the COVID-19 pandemic: Using Latent Dirichlet Allocation for topic modeling on Twitter

This study aims to understand Twitter users’ discourse and psychological reactions to COVID-19. We use machine learning techniques to analyze about 1.9 million English-language tweets related to coronavirus, collected from January 23 to March 7, 2020. A total of 11 salient topics are identified and then categorized into ten themes: “updates about confirmed cases,” “COVID-19 related death,” “cases outside China (worldwide),” “COVID-19 outbreak in South Korea,” “early signs of the outbreak in New York,” “Diamond Princess cruise,” “economic impact,” “preventive measures,” “authorities,” and “supply chain.” The results do not reveal treatment- and symptom-related messages as prevalent topics on Twitter. Sentiment analysis shows that fear of the unknown nature of the coronavirus is dominant in all topics. Implications and limitations of the study are also discussed.
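
The abstract names Latent Dirichlet Allocation but does not describe how the model was fitted. As a rough illustration of the technique only, the sketch below fits LDA to a handful of tokenized tweets with collapsed Gibbs sampling, one common inference method that is not necessarily the one used in the study. The example tweets, the function names (lda_gibbs, top_words, doc_topic_distribution), and the hyperparameter values alpha and beta are all assumptions made for this illustration.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling; returns the two count tables."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                       # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample: P(topic | doc) * P(word | topic), both smoothed.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


def top_words(topic_word, k, n=3):
    """Most frequent words currently assigned to topic k."""
    return [w for w, c in topic_word[k].most_common(n) if c > 0]


def doc_topic_distribution(doc_topic, alpha=0.1):
    """Smoothed per-document topic proportions (each row sums to 1)."""
    result = []
    for row in doc_topic:
        denom = sum(row) + alpha * len(row)
        result.append([(c + alpha) / denom for c in row])
    return result


# Invented toy tweets, already lowercased and tokenized (not study data).
docs = [
    "new confirmed cases reported today".split(),
    "confirmed cases rise again today".split(),
    "cruise ship passengers quarantined".split(),
    "passengers leave cruise ship".split(),
    "markets fall on supply chain fears".split(),
    "supply chain disruption hits markets".split(),
]

doc_topic, topic_word = lda_gibbs(docs, num_topics=3)
theta = doc_topic_distribution(doc_topic)
for k in range(3):
    print(k, top_words(topic_word, k))
```

On a corpus this small the topics are unstable and depend on the seed; a study at the scale described would also need preprocessing and a way to choose the number of topics, neither of which this sketch covers.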
