Phytozome: a comparative platform for green plant genomics

The number of sequenced plant genomes and associated genomic resources is growing rapidly, driven both by an increased focus on plant genomics from funding agencies and by the application of inexpensive next-generation sequencing. To interact with this growing body of data, we have developed Phytozome (http://www.phytozome.net), a comparative hub for plant genome and gene family data and analysis. Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while also providing access to the sequences and functional annotations of a growing number (currently 25) of complete plant genomes, including all the land plants and selected algae sequenced at the Joint Genome Institute, as well as selected species sequenced elsewhere. Through a comprehensive plant genome database and web portal, these data and analyses are available to the broader plant science research community, providing powerful comparative genomics tools that help to link model systems with other plants of economic and ecological importance.
