GOTax: investigating biological processes and biochemical activities along the taxonomic tree

We describe GOTax, a comparative genomics platform that integrates protein annotation with protein family classification and taxonomy. Users can select and compare user-defined sets of proteins, protein families, annotation terms or taxonomic groups, which makes it possible to analyse how biological processes and molecular activities are distributed across taxonomic groups. In particular, a measure of functional similarity is available for comparing proteins and protein families, so that functional relationships can be established independently of evolutionary relationships.
