SARS-CoV-2 genomic diversity and the implications for qRT-PCR diagnostics and transmission

The COVID-19 pandemic has sparked an urgent need to uncover the underlying biology of this devastating disease. Though RNA viruses mutate more rapidly than DNA viruses, a relatively small number of single nucleotide polymorphisms (SNPs) differentiate the main SARS-CoV-2 lineages that have spread throughout the world. In this study, we investigated 129 RNA-seq data sets and 6928 consensus genomes to contrast the intra-host and inter-host diversity of SARS-CoV-2. Our analyses yielded three major observations. First, the mutational profiles of intra-host single nucleotide variants (iSNVs) and SNPs in SARS-CoV-2 are similar, albeit with differences in C > U changes. Second, iSNV and SNP patterns in SARS-CoV-2 are more similar to those of MERS-CoV than to those of SARS-CoV-1. Third, insertions and deletions account for a significant fraction of the genetic diversity of SARS-CoV-2. Altogether, our findings provide insight into SARS-CoV-2 genomic diversity, inform the design of detection tests, and highlight the potential of iSNVs for tracking the transmission of SARS-CoV-2.
