De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds

Hi-C for mosquito genomes

Most genomes sequenced today are determined through the generation of short sequenced bits of DNA that are computationally pieced together like a jigsaw puzzle. This has resulted in the need for funds and additional data to fill in gaps in order to fully assemble the many chromosomes that make up a eukaryotic genome. Dudchenko et al. used the Hi-C method, which measures the distance between contact points within and between chromosomes for scaffold validation, together with correction and ordering to more completely determine the arrangement of short sequencing reads for genome mapping. They validated their approach through the de novo generation of a complete human genome. A comparative analysis of mosquito genomes was made possible by improving the Culex quinquefasciatus genome assembly and generating the genome of Aedes aegypti, the vector of Zika virus.

Science, this issue p. 92

The DNA proximity ligation method Hi-C was used to create a genome assembly for the mosquito Aedes aegypti.

The Zika outbreak, spread by the Aedes aegypti mosquito, highlights the need to create high-quality assemblies of large genomes in a rapid and cost-effective way. Here we combine Hi-C data with existing draft assemblies to generate chromosome-length scaffolds. We validate this method by assembling a human genome, de novo, from short reads alone (67× coverage). We then combine our method with draft sequences to create genome assemblies of the mosquito disease vectors Ae. aegypti and Culex quinquefasciatus, each consisting of three scaffolds corresponding to the three chromosomes in each species. These assemblies indicate that almost all genomic rearrangements among these species occur within, rather than between, chromosome arms. The genome assembly procedure we describe is fast, inexpensive, and accurate, and can be applied to many species.
