Unique, dual-indexed sequencing adapters with UMIs effectively eliminate index cross-talk and significantly improve sensitivity of massively parallel sequencing

Background: Sample index cross-talk can result in false positive calls when massively parallel sequencing (MPS) is used for sensitive applications such as low-frequency somatic variant discovery, ancient DNA investigations, microbial detection in human samples, or circulating cell-free tumor DNA (ctDNA) variant detection. Therefore, the limit of detection of an MPS assay is directly related to the degree of index cross-talk.

Results: Cross-talk rates of up to 0.29% were observed when using standard, combinatorial adapters, resulting in 110,180 (0.1% cross-talk rate) or 1,121,074 (0.29% cross-talk rate) misassigned reads per lane in non-patterned and patterned Illumina flow cells, respectively. Here, we demonstrate that using unique, dual-matched indexed adapters dramatically reduces index cross-talk to ≤1 misassigned read per flow cell lane. While the current study was performed using dual-matched indices, unique, dual-unrelated indices would also be an effective alternative.

Conclusions: For sensitive downstream analyses, the use of combinatorial indices for multiplexed hybrid capture and sequencing is inappropriate, as it results in an unacceptable number of misassigned reads. Cross-talk can be virtually eliminated using dual-matched indexed adapters. These results suggest that use of such adapters is critical to reduce false positive rates in assays that aim to identify low allele frequency events, and strongly indicate that dual-matched adapters should be implemented for all sensitive MPS applications.
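The reasoning behind these results can be illustrated with a minimal Python sketch. It is not the study's pipeline: the index names (`A1`, `B2`, ...), the sample sheets, and both helper functions are hypothetical, chosen only to show why a read pair with swapped indices is silently assigned to a real sample under combinatorial indexing but can be rejected when every index is used only once. The second helper back-calculates the approximate per-lane read totals implied by the reported counts and (rounded) cross-talk rates.

```python
from itertools import product

def assign_sample(index_pair, sample_sheet):
    """Return the sample for an (i7, i5) pair, or None if the pair is not expected."""
    return sample_sheet.get(index_pair)

def implied_total_reads(misassigned_reads, cross_talk_rate):
    """Approximate lane read count implied by a misassigned count and a rate."""
    return misassigned_reads / cross_talk_rate

# Hypothetical index names; real adapters use nucleotide sequences.
i7_indices = ["A1", "A2"]
i5_indices = ["B1", "B2"]

# Combinatorial design: every i7 x i5 combination labels a different sample.
combinatorial_sheet = {
    pair: f"sample_{n}"
    for n, pair in enumerate(product(i7_indices, i5_indices), start=1)
}

# Unique dual design: each i7 and each i5 appears in exactly one sample.
unique_dual_sheet = {("A1", "B1"): "sample_1", ("A2", "B2"): "sample_2"}

# A read whose i5 index was swapped in from another library.
hopped_pair = ("A1", "B2")

# Under the combinatorial design the swapped pair is a valid combination,
# so the read is misassigned without any warning.
print(assign_sample(hopped_pair, combinatorial_sheet))

# Under the unique dual design the pair matches no sample and can be discarded.
print(assign_sample(hopped_pair, unique_dual_sheet))

# Read totals implied by the abstract's figures (approximate, since the
# reported rates are rounded).
print(round(implied_total_reads(110_180, 0.001)))
print(round(implied_total_reads(1_121_074, 0.0029)))
```

Under these assumptions, the only difference between the two designs is whether an unexpected index combination exists at all; with unique dual indices, a misassignment would require both indices to be swapped to those of the same other sample.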
