ArthropodaCyc: a CycADS powered collection of BioCyc databases to analyse and compare metabolism of arthropods

Abstract

Arthropods interact with humans at different levels, with highly beneficial roles (e.g. as pollinators) as well as negative impacts, for example as vectors of human or animal diseases or as agricultural pests. Several arthropod genomes are currently available and many others will be sequenced in the near future in the context of the i5K initiative, offering opportunities for reconstructing, modelling and comparing their metabolic networks. In-depth analysis of these genomic data through metabolism reconstruction is expected to contribute to a better understanding of the biology of arthropods, thereby allowing the development of new strategies to control harmful species. In this context, we present ArthropodaCyc, a dedicated collection of BioCyc databases built with the Cyc annotation database system (CycADS), which allows researchers to perform reliable metabolism comparisons of fully sequenced arthropod genomes. Since annotation quality is a key factor in such global genome comparisons, all proteins from the genomes included in ArthropodaCyc were re-annotated using several annotation tools and orthology information. All functional and domain annotation results, together with their sources, were integrated into the databases for user access. Currently, ArthropodaCyc offers a centralized repository of metabolic pathways, protein sequence domains, Gene Ontology annotations and evolutionary information for 28 arthropod species. This database collection supports metabolism analysis both with integrated tools and through extraction of data in formats suitable for systems biology studies.

Database URL: http://arthropodacyc.cycadsys.org/
