High-fidelity target sequencing of individual molecules identified using barcode sequences: de novo detection and absolute quantitation of mutations in plasma cell-free DNA from cancer patients

Analysis of circulating tumour DNA (ctDNA) is an emerging field of cancer research. However, current ctDNA analysis is usually restricted to one or a few mutation sites because of technical limitations. With massively parallel DNA sequencers, the large number of false positives caused by the high read error rate is a major problem. In addition, the final sequence reads do not represent the original DNA population because of the global amplification step during template preparation. We established a high-fidelity target sequencing system for individual molecules in plasma cell-free DNA, identified using barcode sequences. The system consists of two components. (i) A novel target sequencing method that adds barcode sequences by adaptor ligation. This method uses linear amplification to eliminate the errors introduced during the early cycles of polymerase chain reaction. (ii) The monitoring and removal of erroneous barcode tags. This process allows the individual molecules that have been sequenced to be identified and the number of mutations among them to be quantitated absolutely. Using plasma cell-free DNA from patients with gastric or lung cancer, we demonstrated that the system achieved near-complete elimination of false positives and enabled de novo detection and absolute quantitation of mutations in plasma cell-free DNA.
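The general idea of counting molecules rather than reads can be illustrated with a minimal Python sketch. It is not the pipeline used in this work: the input layout (one barcode and one base call per read at a single target position), the function name count_molecules, and the thresholds min_reads and min_agreement are all hypothetical choices made for illustration. Barcodes supported by too few reads stand in for erroneous barcode tags, and a simple majority call stands in for per-molecule error suppression.

```python
from collections import Counter, defaultdict


def count_molecules(reads, reference_base, min_reads=3, min_agreement=0.8):
    """Count original molecules and mutant molecules at one target position.

    reads: iterable of (barcode, base) pairs, one pair per sequence read.
    min_reads and min_agreement are illustrative thresholds (assumptions),
    not values taken from the paper.
    """
    # Reads sharing a barcode are treated as copies of one original molecule.
    bases_by_barcode = defaultdict(list)
    for barcode, base in reads:
        bases_by_barcode[barcode].append(base)

    total_molecules = 0
    mutant_molecules = Counter()
    for bases in bases_by_barcode.values():
        # A barcode seen in very few reads is treated as a likely erroneous tag.
        if len(bases) < min_reads:
            continue
        consensus, support = Counter(bases).most_common(1)[0]
        # Discard molecules whose reads disagree too much to call a base.
        if support / len(bases) < min_agreement:
            continue
        total_molecules += 1
        if consensus != reference_base:
            mutant_molecules[consensus] += 1
    return total_molecules, dict(mutant_molecules)
```

Because each retained barcode contributes exactly one count, the result is a number of molecules, not a read fraction, which is what makes the quantitation absolute in this simplified picture. A read error that affects only a minority of the reads for one barcode is outvoted by the consensus instead of being reported as a mutation.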
