Genome sequence of the barred knifejaw Oplegnathus fasciatus (Temminck & Schlegel, 1844): the first chromosome-level draft genome in the family Oplegnathidae

Abstract

Background: The barred knifejaw (Oplegnathus fasciatus), a member of the family Oplegnathidae in the order Centrarchiformes, is a commercially important rocky reef fish native to East Asia. Oplegnathus fasciatus has become an important fishery resource for offshore cage aquaculture and for stock enhancement in marine ranching in China, Japan, and Korea. Recently, growth-related sexual dimorphism with a neo-sex chromosome and widespread biotic diseases in O. fasciatus have become increasing concerns in the industry. However, adequate genome resources for gaining insight into sex-determining mechanisms and establishing genetically resistant breeding systems for O. fasciatus are lacking. Here, we sequenced the whole genome of a female O. fasciatus using long-read sequencing and Hi-C data to generate chromosome-length scaffolds and a highly contiguous genome assembly.

Findings: We assembled the O. fasciatus genome from a total of 245.0 Gb of raw reads generated on the Pacific Biosciences (PacBio) Sequel and Illumina HiSeq 2000 platforms. The final draft genome assembly was approximately 778.7 Mb and reached a high level of continuity, with a contig N50 of 2.1 Mb. The assembly size was consistent with the genome size estimated by k-mer analysis (777.5 Mb). We combined Hi-C data with the draft genome assembly to generate chromosome-length scaffolds. Twenty-four scaffolds corresponding to the 24 chromosomes were assembled from 1,372 contigs to a final size of 768.8 Mb, with a contig N50 of 2.1 Mb and a scaffold N50 of 33.5 Mb. The identified repeat sequences accounted for 33.9% of the entire genome, and 24,003 protein-coding genes with an average of 10.1 exons per gene were annotated using de novo prediction, RNA sequencing data, and homology to other teleosts. According to phylogenetic analysis using protein-coding genes, O. fasciatus is closely related to Larimichthys crocea, with O. fasciatus diverging from their common ancestor approximately 70.5–88.5 million years ago.

Conclusions: We generated a high-quality draft genome for O. fasciatus using long-read PacBio sequencing technology, which represents the first chromosome-level reference genome for an Oplegnathidae species. This genome assembly will support research into fish sex-determining mechanisms and can serve as a resource for accelerating genome-assisted improvements in resistant breeding systems.
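The contig and scaffold N50 values reported above follow the usual definition: the length of the sequence at which the cumulative length, counted from the longest sequence downward, first reaches half of the total assembly length. A minimal sketch of that calculation is given below; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the O. fasciatus assembly or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled length.
    """
    if not lengths:
        raise ValueError("no lengths given")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths for illustration only (not real assembly data).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # total is 300; 80 + 70 reaches 150, so this prints 70
```

Applied to the real contig or scaffold lengths of an assembly (for example, parsed from a FASTA index), the same function would yield the contig N50 and scaffold N50 respectively.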
