The population genomics of structural variation in a songbird genus

Structural variation (SV) accounts for a substantial part of the genetic mutations segregating across eukaryotic genomes, with important medical and evolutionary implications. Here, we characterized SV across evolutionary time scales in the songbird genus Corvus using de novo assembly and read mapping approaches. Combining information from short-read (N = 127) and long-read re-sequencing data (N = 31) as well as from optical maps (N = 16) revealed a total of 201,738 insertions, deletions and inversions. Population genetic analysis of SV in the Eurasian crow speciation model revealed an evolutionarily young (~530,000 years) cis-acting 2.25-kb retrotransposon insertion reducing expression of the NDP gene with consequences for premating isolation. Our results attest to the wealth of SV segregating in natural populations and demonstrate its evolutionary significance.
