GoMapMan: integration, consolidation and visualization of plant gene annotations within the MapMan ontology

GoMapMan (http://www.gomapman.org) is an open, web-accessible resource for gene functional annotations in the plant sciences. It was developed to facilitate the improvement, consolidation and visualization of gene annotations across several plant species. GoMapMan is based on the MapMan ontology, which is organized as a hierarchical tree of biological concepts that describe gene functions. Currently, genes of the model species Arabidopsis and three crop species (potato, tomato and rice) are included. The main features of GoMapMan are (i) dynamic and interactive gene product annotation through various curation options; (ii) consolidation of gene annotations for different plant species through the integration of orthologue group information; (iii) traceability of gene ontology changes and annotations; (iv) integration of external knowledge about genes from different public resources; and (v) provision of the gathered information to high-throughput analysis tools via dynamically generated export files. All GoMapMan functionalities are openly available, except for the curation functions, which require prior registration to ensure traceability of the implemented changes.
