Haplotype-resolved whole-genome sequencing by contiguity-preserving transposition and combinatorial indexing

Haplotype-resolved genome sequencing enables the accurate interpretation of medically relevant genetic variation, deep inferences regarding population history and non-invasive prediction of fetal genomes. We describe an approach for genome-wide haplotyping based on contiguity-preserving transposition (CPT-seq) and combinatorial indexing. Tn5 transposition is used to modify DNA with adaptor and index sequences while preserving contiguity. After DNA dilution and compartmentalization, the transposase is removed, resolving the DNA into individually indexed libraries. The libraries in each compartment, enriched for neighboring genomic elements, are further indexed via PCR. Combinatorial 96-plex indexing at both the transposition and PCR stages enables the construction of phased synthetic reads from each of the nearly 10,000 'virtual compartments'. We demonstrate the feasibility of this method by assembling >95% of the heterozygous variants in a human genome into long, accurate haplotype blocks (N50 = 1.4–2.3 Mb). The rapid, scalable and cost-effective workflow could enable haplotype resolution to become routine in human genome sequencing.
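
The combinatorial indexing arithmetic can be illustrated with a minimal Python sketch. Only the two 96-plex index sets come from the abstract; the read tuple layout, the function names and the `max_gap` threshold are assumptions made for illustration and are not the authors' pipeline. The sketch enumerates the 96 × 96 index pairs, groups reads by pair, and merges nearby reads within one virtual compartment into candidate linked spans.

```python
from collections import defaultdict
from itertools import product

N_TRANSPOSITION_INDEXES = 96  # 96-plex indexing at the transposition stage
N_PCR_INDEXES = 96            # 96-plex indexing at the PCR stage


def virtual_compartments(n_tn=N_TRANSPOSITION_INDEXES, n_pcr=N_PCR_INDEXES):
    """Enumerate every (transposition index, PCR index) pair."""
    return list(product(range(n_tn), range(n_pcr)))


def group_by_compartment(reads):
    """Group reads by index pair.

    `reads` is an iterable of (tn_index, pcr_index, chrom, pos) tuples;
    this layout is hypothetical.
    """
    groups = defaultdict(list)
    for tn_index, pcr_index, chrom, pos in reads:
        groups[(tn_index, pcr_index)].append((chrom, pos))
    return groups


def link_reads(positions, max_gap=50_000):
    """Merge reads from one compartment into (chrom, start, end) spans.

    Reads on the same chromosome within `max_gap` bases of the previous
    read are treated as coming from the same original molecule.
    The threshold is an illustrative assumption.
    """
    spans = []
    for chrom, pos in sorted(positions):
        if spans and spans[-1][0] == chrom and pos - spans[-1][2] <= max_gap:
            spans[-1][2] = pos  # extend the current span
        else:
            spans.append([chrom, pos, pos])  # start a new span
    return [tuple(span) for span in spans]


if __name__ == "__main__":
    print(len(virtual_compartments()))  # 96 * 96 index pairs
    demo = [(0, 5, "chr1", 1_000), (0, 5, "chr1", 20_000), (0, 5, "chr1", 900_000)]
    print(link_reads(group_by_compartment(demo)[(0, 5)]))
```

With 96 indexes at each stage the product is 9,216 pairs, which is the "nearly 10,000" figure quoted in the abstract.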
