Oligonucleotide capture sequencing of the SARS-CoV-2 genome and subgenomic fragments from COVID-19 individuals

The newly emerged and rapidly spreading SARS-CoV-2 causes coronavirus disease 2019 (COVID-19). To facilitate a deeper understanding of the viral biology, we developed a capture sequencing methodology to generate SARS-CoV-2 genomic and transcriptome sequences from infected patients. We utilized an oligonucleotide probe set representing the full-length genome to obtain both genomic and transcriptome (subgenomic open reading frame [ORF]) sequences from 45 SARS-CoV-2 clinical samples with varying viral titers. For samples with higher viral loads (cycle threshold value under 33, based on the CDC qPCR assay), complete genomes were generated. Analysis of junction reads revealed regions of differential transcriptional activity and provided evidence of expression of ORF10. Heterogeneous allelic frequencies along the 20 kb ORF1ab gene suggested the presence of a subpopulation of defective interfering viral RNA species in one sample. The associated workflow is straightforward, and hybridization-based capture offers an effective and scalable approach for sequencing SARS-CoV-2 from patient samples.
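As an illustration of the junction-read analysis mentioned above, the following minimal sketch shows one way split-read junctions could be tallied by the subgenomic ORF they lead into. It is not the pipeline used in this work. The ORF start coordinates are assumed to follow the reference genome NC_045512.2, and the `LEADER_END` and `MIN_GAP` thresholds and all function names are hypothetical choices made for this example.

```python
from bisect import bisect_left
from collections import Counter

# Assumed 1-based ORF start coordinates on reference NC_045512.2
# (illustrative only; verify against the annotation actually used).
ORF_STARTS = [
    (21563, "S"), (25393, "ORF3a"), (26245, "E"), (26523, "M"),
    (27202, "ORF6"), (27394, "ORF7a"), (27756, "ORF7b"),
    (27894, "ORF8"), (28274, "N"), (29558, "ORF10"),
]
LEADER_END = 85   # hypothetical upper bound of the 5' leader region
MIN_GAP = 1000    # hypothetical minimum jump to call a junction


def classify_junction(donor_end, acceptor_start):
    """Return the first ORF at or downstream of the acceptor site,
    or None if the read does not look like a leader-to-body junction."""
    if donor_end > LEADER_END or acceptor_start - donor_end < MIN_GAP:
        return None
    starts = [start for start, _ in ORF_STARTS]
    i = bisect_left(starts, acceptor_start)
    return ORF_STARTS[i][1] if i < len(ORF_STARTS) else None


def count_junctions(junctions):
    """Tally (donor_end, acceptor_start) pairs by the ORF they lead into."""
    counts = Counter()
    for donor_end, acceptor_start in junctions:
        orf = classify_junction(donor_end, acceptor_start)
        if orf is not None:
            counts[orf] += 1
    return counts
```

In practice the junction coordinates would come from the split alignments reported by a splice-aware aligner. The sketch takes them as plain integer pairs so that it stays self-contained.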
