Facile single-stranded DNA sequencing of human plasma DNA via thermostable group II intron reverse transcriptase template switching

High-throughput single-stranded DNA sequencing (ssDNA-seq) of cell-free DNA from plasma and other bodily fluids is a powerful method for non-invasive prenatal testing and for the diagnosis of cancers and other diseases. Here, we developed a facile ssDNA-seq method that exploits a novel template-switching activity of thermostable group II intron reverse transcriptases (TGIRTs) for DNA-seq library construction. This activity enables TGIRT enzymes to initiate DNA synthesis directly at the 3′ end of a DNA strand while simultaneously attaching a DNA-seq adapter, without end repair, tailing, or ligation. Initial experiments using this method to sequence E. coli genomic DNA showed that the TGIRT enzyme has surprisingly robust DNA polymerase activity. Further experiments showed that TGIRT-seq of plasma DNA from a healthy individual enables analysis of nucleosome positioning, transcription factor-binding sites, DNA methylation sites, and tissues-of-origin comparably to established methods, but with a simpler workflow that captures precise DNA ends.
