Community Mitigation: A Data-driven System for COVID-19 Risk Assessment in a Hierarchical Manner

The fast-evolving and deadly outbreak of coronavirus disease (COVID-19) has posed grand challenges to human society. To slow the spread of infection and support actionable strategies for community mitigation, we leverage the large-scale, real-time pandemic-related data generated from heterogeneous sources (e.g., disease-related data, demographic data, mobility data, and social media data). In this work, we propose and develop a data-driven system (named α-satellite), as an initial offering, to provide real-time COVID-19 risk assessment in a hierarchical manner in the United States. More specifically, given a location (either user input or automatic positioning), the system automatically provides risk indices for that specific location, the county it is in, and the state as a whole, so that people can select appropriate protective actions while minimizing disruptions to daily life. In α-satellite, we first construct an attributed heterogeneous information network (AHIN) to model the collected multi-source data in a comprehensive way; we then use meta-path based schemes to model both vertical and horizontal information associated with a given location (i.e., point of interest, POI); finally, we devise a novel heterogeneous graph neural network that aggregates the POI's neighborhood information to estimate its risk in a hierarchical manner. To evaluate the performance of α-satellite in real-time COVID-19 risk assessment, we first perform a set of studies to validate its utility; on a real-world dataset consisting of 6,538 annotated POIs, the experimental results show that α-satellite achieves an area under the curve (AUC) of 0.9378, which outperforms the state-of-the-art baselines. After we launched the system for public tests, it attracted 51,190 users as of May 30. Based on an analysis of this large user base, our key finding is that people from more severely affected regions (i.e., those with larger numbers of COVID-19 cases) show stronger interest in using the system for actionable information. Our system and the generated benchmark datasets have been made publicly accessible through our website.
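The meta-path idea in the abstract can be illustrated with a minimal sketch. It is not the α-satellite implementation: the node names, the single "risk feature" per node, the helper functions, and the plain mean used in place of a learned graph neural network aggregator are all hypothetical and chosen only for illustration. A "horizontal" meta-path (POI, county, POI) reaches other POIs in the same county, while a "vertical" meta-path (POI, county, state) climbs the location hierarchy.

```python
from collections import defaultdict

# Hypothetical toy network: typed edges between POIs, counties, and states.
EDGES = {
    ("poi", "county"): [("cafe", "c1"), ("gym", "c1"), ("park", "c2")],
    ("county", "state"): [("c1", "s1"), ("c2", "s1")],
}

# Hypothetical node attributes: one made-up risk feature per node.
FEATURES = {"cafe": 0.2, "gym": 0.8, "park": 0.4, "c1": 0.5, "c2": 0.3, "s1": 0.6}


def build_adjacency(edges):
    """Index edges by (source type, target type), in both directions."""
    adj = defaultdict(lambda: defaultdict(set))
    for (src_type, dst_type), pairs in edges.items():
        for src, dst in pairs:
            adj[(src_type, dst_type)][src].add(dst)
            adj[(dst_type, src_type)][dst].add(src)
    return adj


def metapath_neighbors(adj, start, metapath):
    """Return the nodes reached from `start` by following the node types in `metapath`."""
    frontier = {start}
    for src_type, dst_type in zip(metapath, metapath[1:]):
        frontier = {
            nxt
            for node in frontier
            for nxt in adj[(src_type, dst_type)].get(node, ())
        }
    return frontier


def aggregate(adj, node, metapath, features):
    """Mean feature over meta-path neighbors (a stand-in for a learned aggregator)."""
    neighbors = metapath_neighbors(adj, node, metapath) - {node}
    if not neighbors:
        return None  # no neighbors along this meta-path
    return sum(features[n] for n in neighbors) / len(neighbors)


adj = build_adjacency(EDGES)
horizontal = aggregate(adj, "cafe", ("poi", "county", "poi"), FEATURES)
vertical = aggregate(adj, "cafe", ("poi", "county", "state"), FEATURES)
```

In this toy example, `horizontal` summarizes the other POIs sharing the cafe's county and `vertical` summarizes the enclosing state. A real heterogeneous graph neural network would replace the mean with learned, typically attention-weighted, transformations of multi-dimensional attribute vectors.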
