It’s okay to be green: Draft genome of the North American bullfrog (Rana [Lithobates] catesbeiana)

Frogs play important ecological roles as sentinel species, agents of insect control, and food sources. Several species are important model organisms for studying embryogenesis, development, immune function, and endocrine signaling. The globally distributed Ranidae (true frogs) are the largest frog family and are evolutionarily distant from the model laboratory Xenopus frog species. Consequently, the extensive Xenopus genomic resources are of limited utility for Ranids and related frog species. More widely applicable amphibian genomic data are urgently needed, as more than two-thirds of known species are currently threatened or undergoing population declines. Herein, we report the first genome sequence of a Ranid species, from an adult male North American bullfrog (Rana [Lithobates] catesbeiana). We assembled high-depth Illumina reads (66-fold coverage) into a 5.8 Gbp (NG50 = 57.7 kbp) draft genome using ABySS v1.9.0. The assembly was scaffolded with LINKS and RAILS using pseudo-long-reads from the targeted de novo assembler Kollector and Illumina Synthetic Long-Reads, as well as reads from long fragment (MPET) libraries. We predicted over 22,000 protein-coding genes using the MAKER2 pipeline and identified the genomic loci of 6,227 candidate long noncoding RNAs (lncRNAs) from a composite reference bullfrog transcriptome. Mitochondrial sequence analysis supported Lithobates as a subgenus of Rana. RNA-Seq experiments identified ~6,000 thyroid hormone-responsive transcripts in the back skin of premetamorphic tadpoles, the majority of which regulate DNA/RNA processing. Moreover, one-sixth of the differentially expressed transcripts were putative lncRNAs. Our draft bullfrog genome will serve as a useful resource for the amphibian research community.
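
For readers unfamiliar with the NG50 contiguity statistic quoted above, the following is a minimal sketch of how it is commonly defined: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the estimated genome size. The function name `ng50` and the toy lengths are hypothetical and chosen only for illustration; this is not the authors' code or the routine used by any particular assembler.

```python
def ng50(sequence_lengths, genome_size):
    """Return the NG50 of an assembly, or None if it cannot be computed.

    sequence_lengths: iterable of contig or scaffold lengths (bp).
    genome_size: estimated genome size (bp), supplied by the caller.
    """
    half_genome = genome_size / 2
    running_total = 0
    # Walk from the longest sequence to the shortest, accumulating length
    # until at least half of the estimated genome is covered.
    for length in sorted(sequence_lengths, reverse=True):
        running_total += length
        if running_total >= half_genome:
            return length
    # The assembly spans less than half of the estimated genome size.
    return None


if __name__ == "__main__":
    # Hypothetical toy assembly: six sequences and a 300 bp "genome".
    toy_lengths = [80, 70, 50, 40, 30, 20]
    print(ng50(toy_lengths, 300))
```

Unlike N50, which is normalized by the total assembly length, NG50 is normalized by the estimated genome size, so values from assemblies of different total size remain comparable.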
