Ensembl Genomes 2016: more genomes, more complexity

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrative resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data, including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper updates the previous publications about the resource, focusing on recent developments. These include new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar) and the continued scaling up of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and to make it available through the Ensembl user interfaces.
