Identifying and Ranking Common COVID-19 Symptoms From Tweets in Arabic: Content Analysis

Background: A substantial amount of COVID-19–related data is generated by Twitter users every day. Self-reports of COVID-19 symptoms on Twitter can reveal a great deal about the disease and its prevalence in the community. In particular, self-reports are a valuable resource for learning which symptoms are most common and whether their order of appearance differs among groups in the community. These data may be used to develop a COVID-19 risk assessment system tailored to a specific group of people.

Objective: The aim of this study was to identify the most common symptoms reported by patients with COVID-19, as well as the order of symptom appearance, by examining tweets in Arabic.

Methods: We searched Twitter posts in Arabic for personal reports of COVID-19 symptoms from March 1 to May 27, 2020. We identified 463 users tweeting in Arabic who had reported testing positive for COVID-19 and extracted the symptoms they associated with the disease. Furthermore, we asked them directly via personal messaging to rank the appearance of the first 3 symptoms they had experienced immediately before (or after) their COVID-19 diagnosis. Finally, we tracked their Twitter timelines to identify additional symptoms mentioned within 5 days before or after the day of their first tweet about their COVID-19 diagnosis. In total, 270 COVID-19 self-reports were collected, and symptoms were (at least partially) ranked.

Results: The collected self-reports contained 893 symptoms from 201 (74%) male and 69 (26%) female Twitter users. Most of the 270 tracked users (82%) were living in Saudi Arabia (n=125, 46%) or Kuwait (n=98, 36%). Furthermore, 13% (n=36) of the collected reports were from asymptomatic individuals. Of the 234 users with symptoms, 66% (n=180) provided a chronological order of appearance for at least 3 symptoms. Fever (n=139, 59%), headache (n=101, 43%), and anosmia (n=91, 39%) were the top 3 symptoms mentioned in the self-reports. Additionally, 28% (n=65) reported that their COVID-19 experience started with a fever, 15% (n=34) with a headache, and 12% (n=28) with anosmia. Of the 110 symptomatic cases from Saudi Arabia, the 3 most common symptoms were fever (n=65, 59%), anosmia (n=46, 42%), and headache (n=42, 38%).

Conclusions: This study identified the most common symptoms of COVID-19 from tweets in Arabic. These symptoms can be further analyzed in clinical settings and may be incorporated into a real-time COVID-19 risk estimator.
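The counting behind the Results can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the record layout, the field names (`diagnosis_tweet`, `ranked_symptoms`, `timeline`), the helper functions, and the three toy reports are all assumptions made up for illustration. It keeps timeline symptoms mentioned within 5 days of the diagnosis tweet, then reports how often each symptom was mentioned and how often it appeared first, as percentages of symptomatic users.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical toy records; these are NOT data from the study.
reports = [
    {"user": "u1", "diagnosis_tweet": date(2020, 4, 10),
     "ranked_symptoms": ["fever", "headache", "anosmia"],
     "timeline": [(date(2020, 4, 12), "cough"), (date(2020, 4, 20), "fatigue")]},
    {"user": "u2", "diagnosis_tweet": date(2020, 5, 1),
     "ranked_symptoms": ["headache", "fever"],
     "timeline": [(date(2020, 4, 27), "anosmia")]},
    {"user": "u3", "diagnosis_tweet": date(2020, 5, 3),
     "ranked_symptoms": [], "timeline": []},  # asymptomatic report
]

WINDOW = timedelta(days=5)  # the 5-day window before or after the diagnosis tweet


def symptoms_for(report):
    """Ranked symptoms plus timeline symptoms inside the window, deduplicated."""
    in_window = [symptom for day, symptom in report["timeline"]
                 if abs(day - report["diagnosis_tweet"]) <= WINDOW]
    # dict.fromkeys drops duplicates while preserving first-seen order
    return list(dict.fromkeys(report["ranked_symptoms"] + in_window))


def summarize(reports):
    """Percentages of symptomatic users mentioning, or starting with, each symptom."""
    per_user = [symptoms_for(r) for r in reports]
    symptomatic = [s for s in per_user if s]
    n = len(symptomatic)
    mentions = Counter(sym for s in symptomatic for sym in s)
    first = Counter(r["ranked_symptoms"][0] for r in reports if r["ranked_symptoms"])
    return {
        "n_symptomatic": n,
        "mention_pct": {sym: round(100 * c / n) for sym, c in mentions.most_common()},
        "first_pct": {sym: round(100 * c / n) for sym, c in first.most_common()},
    }


summary = summarize(reports)
print(summary)
```

In the toy data, the "fatigue" mention falls 10 days after the diagnosis tweet and is therefore excluded, while the asymptomatic report is left out of the denominator, mirroring how the abstract reports symptom percentages over the 234 users with symptoms.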
