Detecting Essential Proteins Based on Network Topology, Gene Expression Data, and Gene Ontology Information

The identification of essential proteins in protein-protein interaction (PPI) networks is of great significance for understanding cellular processes. With the increasing availability of large-scale PPI data, numerous centrality measures based on network topology have been proposed to detect essential proteins in PPI networks. However, most current approaches focus mainly on the topological structure of PPI networks and largely ignore Gene Ontology (GO) annotation information. In this paper, we propose a novel centrality measure, called TEO, for identifying essential proteins by combining network topology, gene expression profiles, and GO information. To evaluate the performance of TEO, we compare it with five other methods (degree, betweenness, NC, Pec, and CowEWC) in detecting essential proteins from two different yeast PPI datasets. The simulation results show that adding GO information effectively improves prediction precision and that TEO outperforms the other methods in predicting essential proteins.
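As a rough illustration of what combining the three information sources can look like, the sketch below scores each protein by summing the weights of its edges. Each edge weight is the plain average of a topological term (an edge clustering coefficient), a co-expression term (Pearson correlation clipped at zero), and a GO term (Jaccard overlap of annotation sets). This is not the TEO formula, which the abstract does not specify. The averaging rule, the Jaccard stand-in for GO similarity, the function names, and the toy data are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def jaccard(a, b):
    """Overlap of two GO term sets (an assumed stand-in for GO similarity)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_centrality(edges, expression, go_terms):
    """Sum of edge weights per protein; weights mix three evidence types."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    scores = {node: 0.0 for node in neighbors}
    for u, v in edges:
        # Topology: shared neighbours relative to the most that could be shared.
        common = len(neighbors[u] & neighbors[v])
        denom = min(len(neighbors[u]), len(neighbors[v])) - 1
        ecc = common / denom if denom > 0 else 0.0
        # Expression: keep only positive co-expression.
        coexp = max(pearson(expression[u], expression[v]), 0.0)
        # GO: fraction of shared annotation terms.
        go_sim = jaccard(go_terms[u], go_terms[v])
        # Assumed combination rule: unweighted average of the three terms.
        weight = (ecc + coexp + go_sim) / 3.0
        scores[u] += weight
        scores[v] += weight
    return scores


# Hypothetical toy network: C has the highest degree, but its edge to D
# has no topological, expression, or GO support.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
expression = {"A": [1, 2, 3], "B": [2, 4, 6], "C": [1, 3, 5], "D": [3, 2, 1]}
go_terms = {"A": {"g1", "g2"}, "B": {"g1", "g2"}, "C": {"g1"}, "D": {"g3"}}

scores = combined_centrality(edges, expression, go_terms)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy network, degree centrality alone would rank C first, whereas the combined score places A and B above C because the C-D interaction has no supporting evidence. This is the kind of re-ranking that adding expression and GO information is meant to provide.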
