TGF-beta signaling proteins and the Protein Ontology

Background: The Protein Ontology (PRO) is designed as a formal and principled Open Biomedical Ontologies (OBO) Foundry ontology for proteins. The components of PRO extend from a classification of proteins on the basis of evolutionary relationships at the homeomorphic level to the representation of the multiple protein forms of a gene, including those resulting from alternative splicing, cleavage and/or post-translational modifications. Focusing specifically on the TGF-beta signaling proteins, we describe the building, curation, usage and dissemination of PRO.

Results: PRO is manually curated on the basis of PrePRO, an automatically generated file with content derived from standard protein data sources. Manual curation ensures that the treatment of the protein classes and the internal and external relationships conform to the PRO framework. The current release of PRO is based on experimental data from mouse and human proteins, in which equivalent protein forms are represented by single terms. In addition to the PRO ontology, the annotation of PRO terms is released as a separate PRO association file, which contains, for each given PRO term, an annotation from the experimentally characterized sub-types as well as the corresponding database identifiers and sequence coordinates. The annotations are added in the form of relationships to other ontologies. Whenever possible, equivalent forms in other species are listed to facilitate cross-species comparison. Splice and allelic variants, gene fusion products and modified protein forms are all represented as entities in the ontology. PRO therefore provides both a representation of protein entities and a resource for describing the associated data. This makes PRO useful both for proteomics studies, where isoforms and modified forms must be differentiated, and for studies of biological pathways, where representations need to take account of the different ways in which the cascade of events may depend on specific protein modifications.

Conclusion: PRO provides a framework for the formal representation of protein classes and protein forms in the OBO Foundry. It is designed to enable data retrieval and integration and machine reasoning at the molecular level of proteins, thereby facilitating cross-species comparisons, pathway analysis, disease modeling and the generation of new hypotheses.
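
The association file described in the Results pairs each PRO term with annotations expressed as relationships to other ontologies, together with database identifiers and sequence coordinates. A minimal Python sketch of how such records might be held and grouped per term follows. It is an illustration only: the field names, the `Association` class, the `group_by_term` helper and all identifiers are hypothetical placeholders, not the actual PRO file layout or real PRO, GO or database accessions.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Association:
    """One hypothetical association record for a PRO term."""
    pro_id: str        # placeholder PRO term identifier
    relation: str      # relationship to a term in another ontology
    target_term: str   # placeholder identifier in the other ontology
    db_xref: str       # placeholder source database identifier
    coordinates: Optional[Tuple[int, int]] = None  # sequence start/end, if any


def group_by_term(records: Iterable[Association]) -> Dict[str, List[Association]]:
    """Collect all association records under their PRO term."""
    grouped: Dict[str, List[Association]] = defaultdict(list)
    for rec in records:
        grouped[rec.pro_id].append(rec)
    return dict(grouped)


# Invented example data: one modified form with two annotations,
# one isoform with a single annotation and no coordinates.
records = [
    Association("PRO:EXAMPLE_1", "has_function", "GO:EXAMPLE_A", "DB:X0001", (1, 120)),
    Association("PRO:EXAMPLE_1", "located_in", "GO:EXAMPLE_B", "DB:X0001", (1, 120)),
    Association("PRO:EXAMPLE_2", "participates_in", "GO:EXAMPLE_C", "DB:X0002"),
]

by_term = group_by_term(records)
for pro_id, recs in by_term.items():
    print(pro_id, [(r.relation, r.target_term) for r in recs])
```

Keeping one record per relationship, rather than one record per term, mirrors the statement that annotations are added as relationships to other ontologies, so a single protein form can carry several independent annotations.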
