Chromonomer: A Tool Set for Repairing and Enhancing Assembled Genomes Through Integration of Genetic Maps and Conserved Synteny

The pace of sequencing and computational assembly of novel reference genomes is accelerating. Though DNA sequencing technologies and assembly software tools continue to improve, biological features of genomes, such as repetitive sequence, and molecular artifacts that often accompany sequencing library preparation can still lead to fragmented or chimeric assemblies. If left uncorrected, defects like these impede progress in understanding genome structure and function or, worse, actively mislead such research. Fortunately, additional, independent streams of information, such as a genetic map (particularly a marker-dense map such as one from RADseq) and conserved orthologous gene order from related taxa, can be integrated to scaffold together unlinked, disordered fragments and to restructure a reference genome where it is incorrectly joined. We present a tool set for automating these processes, one that also tracks any changes to the assembly and to the genetic map, and allows the user to scrutinize these changes with the help of web-based, graphical visualizations. Chromonomer takes a user-defined reference genome, a map of genetic markers, and, optionally, conserved synteny information to construct an improved reference genome of chromosome models: a “chromonome”. We demonstrate Chromonomer’s performance on genome assemblies and genetic maps that have disparate characteristics and levels of quality.
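To make the core idea concrete, the following is a minimal sketch of how a genetic map can be used to order scaffolds and to flag candidate chimeric scaffolds. It is not Chromonomer's actual algorithm or input format: the marker tuples, the function name `order_scaffolds`, and the use of the median marker position as the ordering key are all assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical input: one (scaffold, linkage_group, map_position_cM) tuple per marker.
markers = [
    ("scaf1", "LG1", 0.0),
    ("scaf1", "LG1", 2.5),
    ("scaf2", "LG1", 10.0),
    ("scaf2", "LG1", 12.0),
    ("scaf3", "LG1", 5.0),
    ("scaf3", "LG2", 1.0),   # scaf3 carries markers from two linkage groups
    ("scaf4", "LG2", 8.0),
]


def order_scaffolds(markers):
    """Order scaffolds within each linkage group and flag possible chimeras.

    Illustrative assumption: a scaffold (or scaffold piece) is placed at the
    median map position of its markers in that linkage group.
    """
    # scaffold -> linkage group -> list of map positions
    by_scaffold = defaultdict(lambda: defaultdict(list))
    for scaffold, lg, cm in markers:
        by_scaffold[scaffold][lg].append(cm)

    # A scaffold whose markers fall in more than one linkage group is a
    # candidate misjoin that would need to be split before placement.
    chimeric = sorted(s for s, groups in by_scaffold.items() if len(groups) > 1)

    # Collect (median position, scaffold) pairs per linkage group.
    placements = defaultdict(list)
    for scaffold, groups in by_scaffold.items():
        for lg, positions in groups.items():
            placements[lg].append((median(positions), scaffold))

    # Sort each linkage group by map position to get the scaffold order.
    ordered = {lg: [s for _, s in sorted(items)] for lg, items in placements.items()}
    return ordered, chimeric


ordered, chimeric = order_scaffolds(markers)
print(ordered)
print(chimeric)
```

In this toy data set, `scaf3` appears in both linkage groups and is reported as a candidate chimera. A real tool would also need to locate the breakpoint, orient each scaffold, and reconcile markers that conflict with the physical order, none of which this sketch attempts.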
