Mate-pair Library Construction with Controlled Polymerization Enables Comprehensive Structural Rearrangement Detection

Identifying genomic structural rearrangements associated with congenital diseases or tumors is important yet difficult. Mate-pair sequencing enables the positioning of a long DNA fragment with complete and precise breakpoints and has therefore become a common diagnostic approach for identifying chromosomal aberrations. Several mate-pair library construction methods are currently in use; however, because of their cost, large input requirements, and operational complexity, existing workflows are unsuitable for large-scale clinical studies. Herein, we describe a new process that couples advanced controlled polymerization with a non-conventional adapter ligation to generate mate-pairs of desirable length that yield minimal GC bias and improved coverage uniformity. Compared with other methods, our strategy achieves an 8-fold improvement in DNA circularization efficiency, a 39.3-fold reduction in read-pairs that do not cross the circularization junction, and the lowest chimeric rate, collectively producing an approximately 50% increase in physical coverage. In a proof-of-concept study of five insertion translocations, the structural rearrangements were comprehensively detected using the longer 100-bp reads enabled by this approach. Based on its ability to identify changes at single-nucleotide resolution, this approach shows promise as an integrated method for the comprehensive detection of genomic variants at a fraction of the current cost.
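As a rough illustration of why fewer unusable read-pairs translate into higher physical coverage, the sketch below uses the conventional definition of physical coverage: the number of usable mate-pairs multiplied by the mean insert span, divided by the genome size. The function name and all numeric values are hypothetical and chosen only for illustration; they are not taken from this study.

```python
def physical_coverage(usable_pairs: int, mean_insert_bp: float, genome_size_bp: float) -> float:
    """Physical coverage: mean number of mate-pair inserts spanning a genomic position.

    usable_pairs   -- read-pairs that cross the circularization junction and are
                      not chimeric (hypothetical input for this sketch)
    mean_insert_bp -- mean distance spanned by a mate-pair, in base pairs
    genome_size_bp -- size of the reference genome, in base pairs
    """
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return usable_pairs * mean_insert_bp / genome_size_bp


# Illustrative values only (not from the paper): a 3 Gb genome and 3 kb inserts.
baseline = physical_coverage(1_000_000, 3_000, 3_000_000_000)
improved = physical_coverage(1_500_000, 3_000, 3_000_000_000)

# With insert size and genome size fixed, physical coverage scales linearly
# with the number of usable pairs.
print(baseline, improved, improved / baseline)
```

Under these assumptions, recovering 50% more usable pairs from the same sequencing output raises physical coverage by the same factor, which is the sense in which higher circularization efficiency and a lower chimeric rate can improve coverage without additional sequencing.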
