Ensembl Genomes: Extending Ensembl across the taxonomic space

Ensembl Genomes (http://www.ensemblgenomes.org) is a new portal offering integrated access to genome-scale data from non-vertebrate species of scientific interest, developed using the Ensembl genome annotation and visualisation platform. Ensembl Genomes consists of five sub-portals (for bacteria, protists, fungi, plants and invertebrate metazoa) designed to complement the availability of vertebrate genomes in Ensembl. Many of the databases supporting the portal have been built in close collaboration with the scientific community, which we consider essential for maintaining the accuracy and usefulness of the resource. A common set of user interfaces (a graphical genome browser, FTP, BLAST search, a query-optimised data warehouse, programmatic access and a Perl API) is provided for all domains. Data types incorporated include annotation of protein-coding and non-protein-coding genes, cross-references to external resources, and high-throughput experimental data (e.g. data from large-scale studies of gene expression and polymorphism, visualised in their genomic context). Additionally, extensive comparative analysis has been performed, both within defined clades and across the wider taxonomy, and the resulting sequence alignments and gene trees can be accessed through the site.
