Linked-read sequencing enables haplotype-resolved resequencing at population scale

The ability to sequence entire genomes of virtually any organism provides unprecedented insights into the evolutionary history of populations and species. Nevertheless, many population genomic inferences (including the quantification and dating of admixture, introgression and demographic events, and the inference of selective sweeps) are still limited by the lack of high-quality haplotype information. In this respect, the newest generation of sequencing technology promises significant progress. To establish the feasibility of haplotype-resolved genome resequencing at population scale, we investigated properties of linked-read sequencing data of songbirds of the genus Oenanthe across a range of sequencing depths. Our results, based on the comparison of downsampled (25x, 20x, 15x, 10x, 7x, and 5x) with high-coverage data (46-68x) of seven bird genomes, suggest that phasing contiguities and accuracies adequate for most population genomic analyses can already be reached with moderate sequencing effort. At 15x coverage, phased haplotypes span about 90% of the genome assembly, with 50 and 90 percent of the phased sequence located in phase blocks longer than 1.25-4.6 Mb (N50) and 0.27-0.72 Mb (N90), respectively. Phasing accuracy exceeds 99% from 15x coverage upward. Higher coverages yielded higher contiguities (up to about 7 Mb and 1 Mb for N50 and N90, respectively, at 25x coverage), but only marginally improved phasing accuracy. Finally, phasing contiguity improved with input DNA molecule length; thus, higher-quality DNA may help keep sequencing costs down. In conclusion, even for organisms with gigabase-sized genomes like birds, linked-read sequencing at moderate depth opens an affordable avenue towards haplotype-resolved genome resequencing data at population scale.
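The phase block N50 and N90 values quoted above can be illustrated with a minimal sketch. It assumes the usual definition (the block length at which blocks of that length or longer contain at least 50% or 90% of the phased sequence); the function name `phase_block_nx` and the block lengths are hypothetical and are not taken from the study's pipeline or data.

```python
def phase_block_nx(block_lengths, fraction):
    """Return the length L such that phase blocks of length >= L
    together contain at least `fraction` of the total phased sequence.

    Hypothetical helper for illustration only.
    """
    if not block_lengths:
        raise ValueError("block_lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Walk through the blocks from longest to shortest, accumulating length
    ordered = sorted(block_lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]


# Made-up phase block lengths in base pairs (10 Mb of phased sequence in total)
blocks = [4_000_000, 3_000_000, 1_500_000, 1_000_000, 500_000]

n50 = phase_block_nx(blocks, 0.5)  # 3_000_000
n90 = phase_block_nx(blocks, 0.9)  # 1_000_000
print(f"N50 = {n50 / 1e6:.2f} Mb, N90 = {n90 / 1e6:.2f} Mb")
```

Because N90 is reached further down the sorted list than N50, it is always the smaller of the two values, which matches the pattern in the figures reported above.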