A Bayesian approach to construct Context-Specific Gene Ontology: Application to protein function prediction

Protein annotation gives biologists considerable knowledge for understanding life at the molecular level. Because biological experiments are extremely laborious, computational annotation of protein function has emerged as an important alternative. A number of methods have been developed to annotate proteins computationally using standardized nomenclatures such as the Gene Ontology. These methods rely on various independence assumptions to model the annotation problem. However, recent network analyses reveal that the same protein may perform different functions depending on its interactions. In this paper, we take into account the topology of the protein-protein interaction network to propose a new representation of the function ontology. We use a Bayesian network to model and alter the structure of this ontology, creating a new context-specific ontology. We then use this structure to predict the functions of unlabeled proteins. We evaluate our method, called Context-Specific Ontology by the use of the Bayesian Network (ConSOn-BN), on the Saccharomyces cerevisiae protein-protein interaction network and find that ConSOn-BN achieves better results than some known methods.
