Chromosome-level quality scaffolding of brown algal genomes using instaGRAAL, a proximity ligation-based scaffolder

Hi-C has become a popular technique in recent genome assembly projects. It exploits contact frequencies between pairs of loci to bridge and order contigs in draft genomes, resulting in chromosome-level assemblies. However, application of this approach is currently hampered by a lack of robust programs, particularly open-source ones, that can process this type of data effectively. We developed instaGRAAL, a complete overhaul of the GRAAL program that adapts it to allow efficient assembly of large genomes. Both GRAAL and instaGRAAL use a Markov chain Monte Carlo algorithm to perform Hi-C scaffolding, but instaGRAAL features a number of improvements, including a modular polishing approach that optionally integrates independent data. To validate the program, we used it to generate chromosome-level assemblies for two brown algae, Desmarestia herbacea and the model Ectocarpus sp., and quantified the improvements over the initial draft assembly for the latter. Overall, instaGRAAL can generate near-complete assemblies using default parameters and minimal human intervention.
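
The idea of ordering contigs from contact frequencies with a Markov chain Monte Carlo sampler can be illustrated with a toy sketch. This is not instaGRAAL's implementation: the 1/distance contact model, the segment-reversal proposal, and the names `log_score` and `mcmc_order` are all assumptions made for illustration, and contigs are reduced to points on a line with no orientation or length.

```python
import math
import random


def log_score(order, contacts):
    """Toy log-likelihood of a contig order.

    Assumption: expected contacts decay as 1/distance, so a pair with many
    contacts is rewarded for being placed close together.
    `order` lists contig IDs (0..n-1) by position; `contacts[a][b]` is the
    contact count between contigs a and b.
    """
    pos = {contig: i for i, contig in enumerate(order)}
    n = len(order)
    total = 0.0
    for a in range(n):
        for b in range(a + 1, n):
            total -= contacts[a][b] * math.log(abs(pos[a] - pos[b]))
    return total


def mcmc_order(contacts, steps=5000, temperature=0.1, seed=0):
    """Metropolis sampler over contig orders (hypothetical helper).

    Each step reverses a random segment of the current order and accepts
    the move with the usual Metropolis probability. Returns the best order
    seen and its score.
    """
    rng = random.Random(seed)
    n = len(contacts)
    current = list(range(n))
    rng.shuffle(current)
    current_score = log_score(current, contacts)
    best, best_score = current[:], current_score
    for _ in range(steps):
        i, j = sorted(rng.sample(range(n), 2))
        # Symmetric proposal: reverse the segment between positions i and j.
        proposal = current[:i] + current[i:j + 1][::-1] + current[j + 1:]
        proposal_score = log_score(proposal, contacts)
        delta = proposal_score - current_score
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = proposal, proposal_score
            if current_score > best_score:
                best, best_score = current[:], current_score
    return best, best_score


if __name__ == "__main__":
    # Synthetic contacts for 8 contigs whose true order is 0, 1, ..., 7.
    n = 8
    contacts = [[0.0 if a == b else 1.0 / abs(a - b) for b in range(n)]
                for a in range(n)]
    order, score = mcmc_order(contacts)
    print(order, round(score, 3))
```

Because the score depends only on pairwise distances, an order and its mirror image score identically, so a sampler of this kind can recover the layout only up to orientation.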
