Targeted Single Primer Enrichment Sequencing with Single End Duplex-UMI

Specific detection of somatic variants at very low allele fractions requires eliminating artifacts introduced by the next-generation sequencing (NGS) workflow. Various approaches that use unique molecular identifiers (UMIs) to remove NGS artifacts analytically have been described. Among them, Duplex-seq has been shown to be highly effective because it leverages the sequence complementarity of the two DNA strands. However, all published Duplex-seq implementations to date require paired-end sequencing, and combining duplex sequencing with target enrichment has required lengthy hybridization-based enrichment. We developed a simple protocol that enables the retrieval of duplex UMIs in multiplex PCR-based enrichment and sequencing. Using this protocol and reference materials, we demonstrated accurate detection of known SNVs at 0.1–0.2% allele fractions, aided by duplex UMIs. We also observed that low-level base substitution artifacts can be introduced during the preparation of in vitro DNA reference materials, which could limit their utility as a benchmarking tool for variant detection at very low levels. Our new targeted sequencing method offers the benefit of duplex UMIs for removing NGS artifacts in a much simpler workflow than existing targeted duplex sequencing methods.
