The first draft genome assembly of snow sheep (Ovis nivicola).

The snow sheep, Ovis nivicola, is endemic to the mountain ranges of northeastern Siberia and is well adapted to the harsh, cold climate of its habitat. In this study, we carried out whole-genome sequencing, assembly, and gene annotation of a snow sheep using long reads from Nanopore sequencing technology. RNA-seq reads from several tissues were also generated to support gene prediction in the snow sheep genome. The assembled genome was ∼2.62 Gb in length and was represented by 7,157 scaffolds with an N50 of about 2 Mb. Repetitive sequences made up 41% of the total genome. BUSCO analysis showed that the snow sheep assembly contained full-length or partial fragments of 97% of mammalian universal single-copy orthologs (n = 4,104), indicating a high level of assembly completeness. In addition, a total of 20,045 protein-coding sequences were identified using a comprehensive gene prediction pipeline, of which 19,240 (∼96%) were annotated using protein databases. Moreover, homology-based searches and de novo identification detected 1,484 tRNAs, 243 rRNAs, 1,931 snRNAs, and 782 miRNAs in the snow sheep genome. To conclude, we generated the first de novo genome assembly of the snow sheep using long reads; these data are expected to contribute significantly to our understanding of evolution and adaptation within the genus Ovis.
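For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: scaffolds are sorted from longest to shortest, and N50 is the length of the scaffold at which the running total first reaches half of the assembly size. The function name `n50` and the toy scaffold lengths are illustrative assumptions and are not taken from the study; the last lines simply re-derive the ∼96% annotation rate from the counts reported in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (hypothetical helper)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to half
        # of the total assembly length defines N50.
        if running >= half_total:
            return length


# Toy scaffold lengths (made up for illustration, not real data).
toy_lengths = [8, 6, 5, 3, 2, 1]
print("toy N50:", n50(toy_lengths))

# Annotation rate from the counts reported in the abstract.
annotated, predicted = 19_240, 20_045
print(f"annotated fraction: {annotated / predicted:.1%}")
```

In practice the lengths would be read from the assembly FASTA file; that step is omitted here to keep the sketch self-contained.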
