Insights from a chum salmon (Oncorhynchus keta) genome assembly regarding whole-genome duplication and nucleotide variation influencing gene function

Abstract

Chum salmon are ecologically important to Pacific Ocean ecosystems and commercially important to fisheries. To improve the genetic resources available for this species, we sequenced and assembled the genome of a male chum salmon using Oxford Nanopore sequencing reads and the Flye genome assembly software (contig N50: ∼2 Mbp; complete BUSCOs: ∼98.1%). We also resequenced the genomes of 59 chum salmon from hatchery sources to better characterize the genome assembly and the diversity of nucleotide variants that influence phenotypic variation. Using genomic sequences from a doubled haploid individual, we identified regions of the genome assembly that had collapsed because of high sequence similarity between homeologous (duplicated) chromosomes. The homeologous chromosomes are relics of an ancient salmonid-specific genome duplication. These collapsed regions were enriched for genes whose functions are related to the immune system and responses to toxins. By analyzing nucleotide variant annotations of the resequenced genomes, we also identified genes with elevated numbers of variants predicted to moderately impact gene function. A gene ontology enrichment analysis showed that genes related to the immune system and the detection of chemical stimuli (olfaction) carried elevated numbers of these variants. Many of the enriched genes are organized in tandem, which raises the question of why they have this organization.
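The contig N50 quoted in the abstract is a standard contiguity statistic: the length of the contig at which, going from longest to shortest, the running total of contig lengths first reaches half of the total assembly length. The sketch below illustrates that definition only. The function name `n50` and the toy lengths are hypothetical, and this is not the authors' pipeline or the way Flye reports the value.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths (in bp).

    Illustrative helper, not taken from the paper: sort lengths from
    longest to shortest and return the length at which the cumulative
    sum first reaches at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    if any(length <= 0 for length in lengths):
        raise ValueError("contig lengths must be positive")

    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare doubled running sum to the total to avoid fractions.
        if 2 * running >= total:
            return length
    # Unreachable for non-empty positive input; kept for completeness.
    return lengths[-1]


if __name__ == "__main__":
    # Toy example with made-up contig lengths (not data from the study).
    toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
    print(n50(toy_lengths))
```

In practice the lengths would be read from the assembly FASTA; a contig N50 of about 2 Mbp means that half of the assembled bases lie in contigs at least that long.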
