Can Self Reported Symptoms Predict Daily COVID-19 Cases?

The COVID-19 pandemic has impacted lives and economies across the globe and caused many deaths. While vaccination is an important intervention, its roll-out has been slow and unequal across countries. Extensive testing therefore remains one of the key methods for monitoring and containing the virus. However, testing on a large scale is expensive and arduous, so alternative methods for estimating the number of cases are needed. Online surveys have been shown to be an effective method of data collection during the pandemic. In this work, we develop machine learning models to estimate the prevalence of COVID-19 from self-reported symptoms. Our best model predicts daily cases with a mean absolute error (MAE) of 226.30 (normalized MAE of 27.09%) per state, which demonstrates that it is possible to predict the actual number of confirmed cases from self-reported symptoms. The models are developed at two levels of data granularity: local models, each trained at the state level, and a single global model trained on the combined data aggregated across all states. Our results indicate a lower error for the local models than for the global model. We also show that the most important symptoms (features) vary considerably from state to state. This work demonstrates that models developed on crowd-sourced data, curated via online platforms, can complement the existing epidemiological surveillance infrastructure in a cost-effective manner. The code is publicly available at https://github.com/parthpatwa/Can-Self-Reported-Symptoms-Predict-Daily-COVID-19-Cases.
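
To make the reported metrics concrete, the following is a minimal sketch of how a per-state MAE and normalized MAE might be computed. The abstract does not define the normalization, so dividing the MAE by the mean of the actual daily cases is an assumption here. The function names and the case counts are hypothetical and are not taken from the authors' repository.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted daily cases."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def normalized_mae(actual, predicted):
    """MAE as a percentage of mean actual cases (assumed normalization)."""
    mean_actual = sum(actual) / len(actual)
    return 100.0 * mean_absolute_error(actual, predicted) / mean_actual


# Hypothetical (actual, predicted) daily case counts for two states.
state_data = {
    "state_a": ([800, 900, 1000, 1100], [700, 950, 1100, 1050]),
    "state_b": ([100, 200], [110, 180]),
}

# One MAE and one normalized MAE per state, as a local model would be scored.
per_state_mae = {
    name: mean_absolute_error(actual, predicted)
    for name, (actual, predicted) in state_data.items()
}
per_state_nmae = {
    name: normalized_mae(actual, predicted)
    for name, (actual, predicted) in state_data.items()
}

# Averaging across states gives a single "per state" summary figure.
average_mae = sum(per_state_mae.values()) / len(per_state_mae)
```

Under this assumed definition, a state with a small case count can have a low MAE but a high normalized MAE, which is one reason to report both figures.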
