Identifying SARS-CoV-2 infected cells with scVDN

Introduction: Single-cell RNA sequencing (scRNA-seq) is a powerful tool for understanding cellular heterogeneity and identifying cell types in virus-related research. However, direct identification of SARS-CoV-2-infected cells at the single-cell level remains challenging, hindering the understanding of viral pathogenesis and the development of effective treatments.

Methods: In this study, we propose a deep learning framework, the single-cell virus detection network (scVDN), to predict the infection status of single cells. The scVDN is trained on scRNA-seq data from multiple nasal swab samples obtained from several contributors with varying cell types. To objectively evaluate scVDN’s performance, we establish a model evaluation framework suitable for real experimental data.

Results and Discussion: Our results demonstrate that scVDN outperforms four state-of-the-art machine learning models in identifying SARS-CoV-2-infected cells, even with extremely imbalanced labels in real data. Specifically, scVDN achieves a perfect AUC score of 1 in four cell types. Our findings have important implications for advancing virus research and improving public health by enabling the identification of virus-infected cells at the single-cell level, which is critical for diagnosing and treating viral infections. The scVDN framework can be applied to other single-cell virus-related studies, and we make all source code and datasets publicly available on GitHub at https://github.com/studentiz/scvdn.
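To illustrate why evaluation under extremely imbalanced labels calls for metrics such as AUC rather than plain accuracy, the following is a minimal sketch in Python. It is not the evaluation framework used for scVDN: the helper names (`roc_auc`, `mcc`, `accuracy`) and the toy data (2 infected cells among 100) are hypothetical and chosen only for illustration.

```python
from math import sqrt


def roc_auc(labels, scores):
    """AUC as the probability that a random positive cell outscores a
    random negative cell (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, preds):
    """Fraction of cells whose predicted label matches the true label."""
    return sum(y == p for y, p in zip(labels, preds)) / len(labels)


def mcc(labels, preds):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    tp = sum(y == 1 and p == 1 for y, p in zip(labels, preds))
    tn = sum(y == 0 and p == 0 for y, p in zip(labels, preds))
    fp = sum(y == 0 and p == 1 for y, p in zip(labels, preds))
    fn = sum(y == 1 and p == 0 for y, p in zip(labels, preds))
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy data: 2 infected cells (label 1) among 100 cells.
labels = [1, 1] + [0] * 98

# A trivial classifier that calls every cell uninfected.
trivial_preds = [0] * 100

# Hypothetical scores: one uninfected cell (0.6) outranks one infected cell (0.4).
imperfect_scores = [0.9, 0.4] + [0.1] * 97 + [0.6]

# Hypothetical scores that rank every infected cell above every uninfected one.
perfect_scores = [0.9, 0.8] + [0.1] * 98

if __name__ == "__main__":
    print("trivial accuracy:", accuracy(labels, trivial_preds))
    print("trivial MCC:", mcc(labels, trivial_preds))
    print("imperfect AUC:", roc_auc(labels, imperfect_scores))
    print("perfect AUC:", roc_auc(labels, perfect_scores))
```

In this toy setting the trivial classifier looks accurate while detecting no infected cells, whereas AUC and the Matthews correlation coefficient reflect how well the rare infected cells are actually separated from the rest. An AUC of 1 means every infected cell is scored above every uninfected cell.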
