A new massively parallel nanoball sequencing platform for whole exome research

Background: Whole exome sequencing (WES) has been widely used in human genetics research. BGISEQ-500 is a recently established next-generation sequencing platform. However, the performance of BGISEQ-500 on WES has not been well studied. In this study, we evaluated the performance of BGISEQ-500 on WES by side-by-side comparison with HiSeq 4000, using the well-characterized human sample NA12878.

Results: BGISEQ demonstrated reproducibility in variant detection as high as that of HiSeq. The SNVs called from BGISEQ data were also highly consistent with the HiSeq results (concordance 96.5% to 97%). Variant detection accuracy was then evaluated using data from the Genome in a Bottle project as the benchmark. Both platforms showed similar sensitivity and precision in SNV detection, whereas in indel detection BGISEQ showed slightly higher sensitivity and lower precision. Further analysis of the impact of sequencing depth and read length on variant detection accuracy showed that sensitivity continued to increase at sequencing depths above 100x, and that the impact of read length was minor when using 100x data.

Conclusions: This study suggests that BGISEQ-500 is a qualified sequencing platform for WES.
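The accuracy measures named above can be illustrated with a minimal Python sketch. It assumes variants are compared as exact (chromosome, position, ref, alt) tuples against a benchmark set, and it assumes concordance means shared calls divided by the union of both call sets; the abstract does not state the definition the authors used. The function names and toy variants are hypothetical, and this is not the study's pipeline.

```python
from typing import NamedTuple, Set, Tuple

# A variant is identified by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


class Accuracy(NamedTuple):
    sensitivity: float  # TP / (TP + FN): fraction of benchmark variants recovered
    precision: float    # TP / (TP + FP): fraction of calls present in the benchmark


def evaluate_calls(calls: Set[Variant], truth: Set[Variant]) -> Accuracy:
    """Compare a call set to a benchmark set by exact variant match."""
    tp = len(calls & truth)   # called and in the benchmark
    fp = len(calls - truth)   # called but absent from the benchmark
    fn = len(truth - calls)   # in the benchmark but not called
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return Accuracy(sensitivity, precision)


def concordance(a: Set[Variant], b: Set[Variant]) -> float:
    """Assumed definition: shared calls over the union of both call sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy data (made up for illustration, not from the study).
truth = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
         ("chr2", 50, "G", "A"), ("chr2", 75, "T", "C")}
platform_a = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
              ("chr2", 50, "G", "A"), ("chr3", 10, "A", "T")}
platform_b = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
              ("chr2", 75, "T", "C")}

result_a = evaluate_calls(platform_a, truth)
shared = concordance(platform_a, platform_b)
```

In practice, benchmark comparisons are usually restricted to high-confidence regions and need allele normalization, particularly for indels, which exact tuple matching does not handle.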
