Whole Genome Sequencing of the Pirarucu (Arapaima gigas) Supports Independent Emergence of Major Teleost Clades

Abstract The Pirarucu (Arapaima gigas) is one of the world’s largest freshwater fishes and a member of the superorder Osteoglossomorpha (bonytongues), one of the oldest lineages of ray-finned fishes. This obligate air-breather is found in the Amazon River basin and has attractive potential for aquaculture. Its phylogenetic position among bony fishes makes the Pirarucu a relevant subject for evolutionary studies of early teleost diversification. Here, we present, for the first time, a draft assembly of the A. gigas genome, providing useful information for further functional and evolutionary studies. The A. gigas genome was assembled from 103 Gb of raw reads sequenced on an Illumina platform. The final draft genome assembly was ∼661 Mb, with a contig N50 of 51.23 kb and a scaffold N50 of 668 kb. Repeat sequences accounted for 21.69% of the whole genome, and a total of 24,655 protein-coding genes were predicted from the genome assembly, with an average of nine exons per gene. Phylogenomic analysis based on 24 fish species supported the hypothesis that Osteoglossomorpha and Elopomorpha (eels, tarpons, and bonefishes) are sister groups, together forming the sister lineage of Clupeocephala (the remaining teleosts). Divergence time estimates suggested that the Osteoglossomorpha and Elopomorpha lineages emerged independently within a period of ∼30 Myr in the Jurassic. The draft genome of A. gigas provides a valuable genetic resource for further evolutionary studies and may also offer valuable data for economic applications.
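The contig and scaffold N50 values quoted above are contiguity statistics: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal sketch of that standard definition, not the procedure used for this assembly; the function name `n50` and the toy lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration only: it applies the usual
    definition of N50 and is not taken from the study's pipeline.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    # Walk from the longest sequence downwards, accumulating length.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum covers half of the
        # total assembly length is the N50.
        if running >= half_total:
            return length


# Toy contig lengths (arbitrary units), not data from the A. gigas assembly.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70: 80 + 70 = 150 reaches half of the 300 total
```

Under this definition, a scaffold N50 of 668 kb means that scaffolds of at least 668 kb account for at least half of the ∼661 Mb assembly.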
