The i5k Workspace@NAL—enabling genomic data access, visualization and curation of arthropod genomes

The 5000 arthropod genomes initiative (i5k) has tasked itself with coordinating the sequencing of 5000 insect or related arthropod genomes. The resulting influx of data, mostly from small research groups or communities with little bioinformatics experience, will require visualization, dissemination and curation, preferably from a centralized platform. The National Agricultural Library (NAL) has implemented the i5k Workspace@NAL (http://i5k.nal.usda.gov/) to help meet the i5k initiative's genome hosting needs. Any i5k member is encouraged to contact the i5k Workspace with their genome project details. Once submitted, new content will be accessible via organism pages, genome browsers and BLAST search engines, which are implemented via the open-source Tripal framework, a web interface for the underlying Chado database schema. We also implement the Web Apollo software for groups that choose to curate gene models. New content will add to the existing body of 35 arthropod species, which are relevant to many areas of arthropod genomic research, including agriculture, invasion biology, systematics, ecology and evolution, and developmental research.
