Chromosome-level genome assembly, annotation, and phylogenomics of the gooseneck barnacle Pollicipes pollicipes

Abstract

Background: The barnacles are a group of >2,000 species that have fascinated biologists, including Darwin, for centuries. Their lifestyles are extremely diverse, from free-swimming larvae to sessile adults, and even root-like endoparasites. Barnacles also cause hundreds of millions of dollars of losses annually due to biofouling. However, genomic resources for crustaceans, and barnacles in particular, are lacking.

Results: Using 62× Pacific Biosciences coverage, 189× Illumina whole-genome sequencing coverage, 203× Hi-C coverage, and 69× CHi-C coverage, we produced a chromosome-level genome assembly of the gooseneck barnacle Pollicipes pollicipes. The P. pollicipes genome is 770 Mb long and its assembly is one of the most contiguous and complete crustacean genomes available, with a scaffold N50 of 47 Mb and 90.5% of the BUSCO Arthropoda gene set. Using the genome annotation produced here along with transcriptomes of 13 other barnacle species, we completed phylogenomic analyses on a nearly 2 million amino acid alignment. Contrary to previous studies, our phylogenies suggest that the Pollicipedomorpha is monophyletic and sister to the Balanomorpha, which alters our understanding of barnacle larval evolution and suggests homoplasy in a number of naupliar characters. We also compared transcriptomes of P. pollicipes nauplius larvae and adults and found that nearly one-half of the genes in the genome are differentially expressed, highlighting the vastly different transcriptomes of larval and adult gooseneck barnacles. Annotation of the genes with KEGG and GO terms reveals that these stages differ in many functions, including cuticle binding, chitin binding, microtubule motor activity, and membrane adhesion.

Conclusion: This study provides high-quality genomic resources for a key group of crustaceans. This is especially valuable given the roles P. pollicipes plays in European fisheries, as a sentinel species for coastal ecosystems, and as a model for studying barnacle adhesion, as well as its key position in the barnacle tree of life. The combination of genomic, phylogenetic, and transcriptomic analyses here provides valuable insights into the evolution and development of barnacles.

[117]  C. Darwin A Monograph on the Sub-Class Cirripedia , 1851 .