The Impact of DNA Input Amount and DNA Source on the Performance of Whole-Exome Sequencing in Cancer Epidemiology

Background: Whole-exome sequencing (WES) has recently emerged as an appealing approach to systematically study coding variants. However, the requirement for a large amount of high-quality DNA poses a barrier that may limit its application in large cancer epidemiologic studies. We evaluated the performance of WES with low DNA input amounts and with saliva DNA as an alternative source material. Methods: Five breast cancer patients were randomly selected from the Pathways Study. From each patient, four samples (3 μg, 1 μg, and 0.2 μg of blood DNA and 1 μg of saliva DNA) were aliquoted for library preparation using the Agilent SureSelect Kit and sequenced on the Illumina HiSeq 2500. Quality metrics of sequencing and variant calling, as well as concordance of variant calls across the whole exome and in 21 known breast cancer genes, were assessed by input amount and DNA source. Results: Input amount and DNA source made little difference to the quality of sequencing and variant calling. The concordance rate was about 98% for single-nucleotide variant calls and 83% to 86% for short insertion/deletion calls. For the 21 known breast cancer genes, WES based on low input amount and on saliva DNA identified the same set of variants in samples from the same patient. Conclusions: Low DNA input amounts, as well as saliva DNA, can be used to generate WES data of satisfactory quality. Impact: Our findings support the expansion of WES applications in cancer epidemiologic studies where only low DNA amounts or saliva samples are available. Cancer Epidemiol Biomarkers Prev; 24(8); 1207–13. ©2015 AACR.
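
The abstract reports concordance rates between variant calls from samples of the same patient but does not state how concordance was defined. As an illustration only, the sketch below assumes one common definition: the number of variants called in both samples divided by the number called in either. The function name, the variant key (chromosome, position, reference allele, alternate allele), and the toy data are hypothetical and are not taken from the study.

```python
from typing import Set, Tuple

# A variant call keyed by (chromosome, position, ref allele, alt allele).
# This key is an assumption made for illustration.
Variant = Tuple[str, int, str, str]


def concordance_rate(calls_a: Set[Variant], calls_b: Set[Variant]) -> float:
    """Fraction of variants called in either sample that were called in both.

    Assumed definition (shared calls / union of calls); the study may have
    used a different one.
    """
    union = calls_a | calls_b
    if not union:
        # No calls in either sample: treat as trivially concordant.
        return 1.0
    return len(calls_a & calls_b) / len(union)


# Hypothetical toy call sets for two samples from one patient.
blood_3ug = {
    ("chr17", 100, "A", "G"),
    ("chr13", 200, "C", "T"),
    ("chr2", 300, "G", "A"),
}
saliva_1ug = {
    ("chr17", 100, "A", "G"),
    ("chr13", 200, "C", "T"),
    ("chr2", 400, "T", "C"),
}

rate = concordance_rate(blood_3ug, saliva_1ug)
print(f"concordance: {rate:.2f}")
```

In the toy data, two calls are shared out of four distinct calls, so the rate is 0.50. Under this definition, single-nucleotide variants and short insertions/deletions would be scored separately by passing each variant type as its own pair of call sets.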
