Algal Functional Annotation Tool: a web-based analysis suite to functionally interpret large gene lists using integrated annotation and expression data

Background
Progress in genome sequencing is proceeding at an exponential pace, and several new algal genomes become available every year. One challenge facing the community is associating the protein sequences encoded in these genomes with biological function. While most genome assembly projects generate annotations for predicted protein sequences, these annotations are usually sparse and integrate functional terms from only a small number of databases. Another challenge is using annotations to interpret the large lists of 'interesting' genes generated by genome-scale datasets. Previously, these gene lists had to be analyzed across several independent biological databases, often on a gene-by-gene basis. In contrast, annotation databases such as DAVID integrate data from multiple functional databases and reveal the underlying biological themes of large gene lists. While several such databases have been constructed for animals, none is currently available for the study of algae. Given renewed interest in algae as potential sources of biofuels and the emergence of multiple algal genome sequences, a significant need has arisen for such a database to process the growing compendia of algal genomic data.

Description
The Algal Functional Annotation Tool is a comprehensive web-based analysis suite that integrates annotation data from several pathway, ontology, and protein family databases. The current version provides annotation for the model alga Chlamydomonas reinhardtii; additional genomes will be included in the future. The site allows users to interpret large gene lists by identifying associated functional terms and testing their enrichment. Additionally, expression data for several experimental conditions were compiled and analyzed to provide an expression-based enrichment search. A tool to search for functionally related genes based on gene expression across these conditions is also provided. Other features include dynamic visualization of genes on KEGG pathway maps and batch gene identifier conversion.

Conclusions
The Algal Functional Annotation Tool aims to provide an integrated data-mining environment for algal genomics by combining data from multiple annotation databases into a centralized tool. The site is designed to expedite functional annotation and the interpretation of gene lists, such as those derived from high-throughput RNA-seq experiments. The tool is publicly available at http://pathways.mcdb.ucla.edu.
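
The abstract does not state which statistic the tool uses to score enrichment of a functional term in a gene list. As an illustration only, the sketch below assumes the upper tail of the hypergeometric distribution, a common choice in enrichment tools. The function name, its parameters, and the example counts are hypothetical and are not taken from the tool.

```python
from math import comb


def enrichment_p_value(hits, list_size, term_size, genome_size):
    """Upper-tail hypergeometric probability P(X >= hits).

    Hypothetical helper, not the tool's API.
    hits        -- genes in the user's list annotated with the term
    list_size   -- total genes in the user's list
    term_size   -- genes in the genome annotated with the term
    genome_size -- total annotated genes in the genome (background)
    """
    max_hits = min(list_size, term_size)
    # Number of ways to draw any list of this size from the background.
    total = comb(genome_size, list_size)
    # Number of ways to draw a list containing at least `hits` term genes.
    tail = sum(
        comb(term_size, k) * comb(genome_size - term_size, list_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Made-up counts: 8 of 50 listed genes carry a term that annotates
    # 120 of 15000 background genes.
    print(enrichment_p_value(8, 50, 120, 15000))
```

A small p-value means the term appears in the list more often than random sampling from the background would predict. When many terms are tested at once, a multiple-testing correction is normally applied on top of this value.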
