COFECO: composite function annotation enriched by protein complex data

COFECO is a web-based tool for composite annotation of protein complexes, KEGG pathways and Gene Ontology (GO) terms within a class of genes and their orthologs under study. Widely used functional enrichment tools based on GO and KEGG pathways produce large lists of annotations from which it is difficult to derive consolidated information, and these lists often include over-generalized terms. The interrelationship of annotation terms can be delineated more clearly by integrating information on physically interacting proteins with biological pathways and GO terms. COFECO has the following advanced characteristics: (i) Composite annotation sets of correlated functions and cellular processes for a given gene set can be identified in a more comprehensive and specific way by using protein complex data together with GO and KEGG pathways as annotation resources. (ii) Orthology-based integrative annotations across different species complement incomplete annotations in an individual genome and provide information on evolutionarily conserved correlations. (iii) A term filtering feature enables users to collect the specific annotations enriched with selected function terms. (iv) Annotation results from two different datasets can be cross-compared. In addition, COFECO provides a web-based GO hierarchical viewer and a KEGG pathway viewer in which the enrichment results can be summarized and further explored. COFECO is freely accessible at http://piech.kaist.ac.kr/cofeco.
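
As an illustration of what a composite annotation can look like in practice, the following minimal sketch scores pairs of terms (for example, a protein complex paired with a GO term) by testing whether the genes carrying both annotations are over-represented in a query gene set. The abstract does not state which statistical test COFECO uses, so the one-sided hypergeometric test here is an assumption, and the function names (`hypergeom_sf`, `composite_enrichment`), the term labels and the toy gene identifiers are all hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K carry the annotation."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def composite_enrichment(query, background, source_a, source_b):
    """Score every (term_a, term_b) pair by enrichment of its shared genes.

    source_a and source_b map a term name to the set of genes annotated
    with it (e.g. protein complexes and GO terms). This pairing scheme is
    an illustrative assumption, not COFECO's documented procedure.
    """
    background = set(background)
    query = set(query) & background
    results = []
    for term_a, genes_a in source_a.items():
        for term_b, genes_b in source_b.items():
            # Genes carrying both annotations form the composite term.
            composite = genes_a & genes_b & background
            if not composite:
                continue
            overlap = len(composite & query)
            p = hypergeom_sf(overlap, len(background), len(composite), len(query))
            results.append((term_a, term_b, overlap, p))
    # Most significant composite terms first.
    return sorted(results, key=lambda r: r[3])


# Toy data: 20 background genes, one complex and one GO term (all hypothetical).
background = {f"g{i}" for i in range(20)}
complexes = {"complex_X": {f"g{i}" for i in range(5)}}
go_terms = {"GO_term_Y": {f"g{i}" for i in range(10)}}
query = {"g0", "g1", "g2", "g3", "g15"}

for term_a, term_b, overlap, p in composite_enrichment(query, background, complexes, go_terms):
    print(f"{term_a} + {term_b}: overlap={overlap}, p={p:.4g}")
```

In a real analysis the p-values over all composite terms would also need a multiple-testing correction, which this sketch omits.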
