Characterisation of the transcriptome and proteome of SARS-CoV-2 using direct RNA sequencing and tandem mass spectrometry reveals evidence for a cell passage induced in-frame deletion in the spike glycoprotein that removes the furin-like cleavage site

Direct RNA sequencing using an Oxford Nanopore MinION characterised the transcriptome of SARS-CoV-2 grown in Vero E6 cells. This cell line is being widely used to propagate the novel coronavirus. The viral transcriptome was analysed using a recently developed ORF-centric pipeline. This revealed that the pattern of viral transcripts (i.e. subgenomic mRNAs) generally fitted the predicted replication and transcription model for coronaviruses. A 24 nt in-frame deletion was detected in subgenomic mRNAs encoding the spike (S) glycoprotein. This feature was identified in over half of the mapped transcripts and was predicted to remove a proposed furin cleavage site from the S glycoprotein. This motif directs cleavage of the S glycoprotein into functional subunits during virus entry or exit. Cleavage of the S glycoprotein can be a barrier to zoonotic coronavirus transmission and can affect viral pathogenicity. Allied to this transcriptome analysis, tandem mass spectrometry was used to identify over 500 viral peptides and 44 phosphopeptides, covering almost all of the proteins predicted to be encoded by the SARS-CoV-2 genome, including peptides unique to the deleted variant of the S glycoprotein. Detection of an apparently viable deletion in the furin cleavage site of the S glycoprotein reinforces the point that this and other regions of SARS-CoV-2 proteins may readily mutate. This is of clear significance given the interest in the S glycoprotein as a potential vaccine target and the observation that the furin cleavage site likely contributes strongly to the pathogenesis and zoonosis of this virus. The viral genome sequence should therefore be carefully monitored during the growth of viral stocks for research and animal challenge models and, potentially, in clinical samples, because such variations may result in different levels of virulence, morbidity and mortality.
