Identifying and profiling structural similarities between Spike of SARS-CoV-2 and other viral or host proteins with Machaon

Using protein structure to predict function, interactions, and evolutionary history is still an open challenge, and existing approaches rely extensively on protein homology and families. Here, we present Machaon, a new data-driven method that combines orientation-invariant metrics on phi-psi angles, inter-residue contacts, and surface complexity. It can be readily applied to whole structures or to segments, such as domains and binding sites. Machaon was applied to SARS-CoV-2 Spike monomers of the native, Delta, and Omicron variants and identified correlations with a wide range of viral proteins from close to distant taxonomy ranks, as well as with host receptors such as ACE2. The associated whole structures have sequence identities of 32.7% and 38.5% on protein secondary structure and gene coding region, respectively, 86% chemical similarity, and 22.1% protein tertiary structure similarity (median values, native Spike). Machaon's meta-analysis revealed Spike's relationships with biological processes such as ubiquitination and angiogenesis and highlighted different patterns in virus attachment among the studied variants. Available at: https://github.com/anastasiadoulab/machaon
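To illustrate what "orientation invariant" means for a torsion-angle metric such as phi or psi, the following minimal sketch computes a dihedral angle from four points and checks that it is unchanged by a rigid rotation and translation. It is not taken from the Machaon code base; the function names (`dihedral`, `rigid_transform`) and the toy coordinates are assumptions made only for this illustration.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _scale(a, s):
    return (a[0] * s, a[1] * s, a[2] * s)

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) around the p1-p2 axis (hypothetical helper)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond.
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, _scale(b1, _dot(b0, b1)))
    w = _sub(b2, _scale(b1, _dot(b2, b1)))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

def rigid_transform(p, theta, shift):
    """Rotate a point about the z axis by theta (radians), then translate it."""
    c, s = math.cos(theta), math.sin(theta)
    x, y, z = p
    return (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])

# Toy coordinates standing in for four consecutive backbone atoms (assumed values).
atoms = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
moved = [rigid_transform(p, theta=0.7, shift=(3.0, -2.0, 5.0)) for p in atoms]

angle_original = dihedral(*atoms)
angle_moved = dihedral(*moved)
print(angle_original, angle_moved)
```

Because the angle depends only on the relative geometry of the four atoms, the two printed values agree up to floating-point error, so structures can be compared on such features without first superimposing them.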
