COVID-19 Sensing: Negative Sentiment Analysis on Social Media in China via BERT Model

Coronavirus disease 2019 (COVID-19) poses massive challenges for the world. Analyzing public sentiment during the outbreak provides useful information for shaping appropriate public health responses. On Sina Weibo, a popular Chinese social media platform, posts with negative sentiment are valuable for analyzing public concerns. We analyze 999,978 randomly selected COVID-19-related Weibo posts published between 1 January 2020 and 18 February 2020. Specifically, a BERT (Bidirectional Encoder Representations from Transformers) model, pre-trained without supervision and then fine-tuned, is adopted to classify posts into sentiment categories (positive, neutral, and negative), and a TF-IDF (term frequency-inverse document frequency) model is used to summarize the topics of posts. Trend analysis and thematic analysis are conducted to identify the characteristics of negative sentiment. In general, the fine-tuned BERT model performs sentiment classification with considerable accuracy. In addition, the topics extracted by TF-IDF convey the characteristics of COVID-19-related posts well. We observe that people are concerned about four aspects of COVID-19: Virus Origin (Gamey Food, 3.08%; Bat, 2.70%; Conspiracy Theory, 1.43%), Symptom (Fever, 2.13%; Cough, 1.19%), Production Activity (Go to Work, 1.94%; Resume Work, 1.12%; School New Semester Beginning, 1.06%), and Public Health Control (Temperature Taking, 1.39%; Coronavirus Cover-up, 1.26%; City Shutdown, 1.09%). These results from Weibo posts offer constructive guidance for public health responses: transparent information sharing and scientific guidance might help alleviate public concerns.
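As a rough illustration of how TF-IDF can surface the characteristic terms of a post, the sketch below scores each term by its frequency within a post multiplied by the logarithm of the inverse fraction of posts containing it. This is a minimal sketch, not the authors' pipeline: the function name `tfidf_top_terms`, the unsmoothed weighting, and the toy English tokens are all assumptions made for illustration, and real Weibo posts would first need Chinese word segmentation.

```python
import math
from collections import Counter


def tfidf_top_terms(docs, k=3):
    """Return the top-k (term, score) pairs for each tokenized document.

    Hypothetical helper for illustration only.
    docs: list of token lists, one list per post.
    Score = (term count / post length) * ln(number of posts / posts containing term).
    """
    n_docs = len(docs)

    # Document frequency: in how many posts does each term appear?
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    results = []
    for tokens in docs:
        tf = Counter(tokens)
        total = len(tokens)
        scores = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        results.append(ranked[:k])
    return results


# Toy, already-tokenized posts (invented for illustration, not real data).
posts = [
    ["fever", "cough", "fever", "hospital"],
    ["work", "resume", "work", "city"],
    ["city", "shutdown", "hospital", "fever"],
]
top = tfidf_top_terms(posts, k=2)
for post_terms in top:
    print([term for term, _ in post_terms])
```

A term that appears in every post receives a score of zero under this weighting, which is why common words contribute little to the extracted topics.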
