A new essential protein discovery method based on the integration of protein-protein interaction and gene expression data

Background: Identifying essential proteins is a challenging task because it requires experimental approaches that are time-consuming and laborious. With advances in high-throughput technologies, a large number of protein-protein interactions have become available, creating unprecedented opportunities for detecting protein essentiality at the network level. A series of computational approaches have been proposed for predicting essential proteins based on network topology. However, topology-based centrality measures are very sensitive to the robustness of the network. A new, robust essential protein discovery method would therefore be of great value.

Results: In this paper, we propose a new centrality measure, named PeC, based on the integration of protein-protein interaction and gene expression data. The performance of PeC is validated on the protein-protein interaction network of Saccharomyces cerevisiae. The experimental results show that the prediction precision of PeC clearly exceeds that of fifteen previously proposed centrality measures: Degree Centrality (DC), Betweenness Centrality (BC), Closeness Centrality (CC), Subgraph Centrality (SC), Eigenvector Centrality (EC), Information Centrality (IC), Bottle Neck (BN), Density of Maximum Neighborhood Component (DMNC), Local Average Connectivity-based method (LAC), Sum of ECC (SoECC), Range-Limited Centrality (RL), L-index (LI), Leader Rank (LR), Normalized α-Centrality (NC), and Moduland-Centrality (MC). In particular, the improvement of PeC over the classic centrality measures (BC, CC, SC, EC, and BN) is more than 50% when predicting no more than 500 proteins.

Conclusions: We demonstrate that integrating the protein-protein interaction network with gene expression data can help improve the precision of predicting essential proteins. The new centrality measure, PeC, is an effective essential protein discovery method.
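The kind of evaluation summarized in the Results can be illustrated with a minimal sketch: rank proteins by a centrality score, take the top k, and measure the fraction that are known to be essential. The sketch below uses Degree Centrality (DC), one of the baseline measures listed above, because the abstract does not give the PeC formula. The toy network, the essential set, and the function names `degree_centrality`, `precision_at_k`, and `relative_improvement` are assumptions made for illustration, not the authors' code or data.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {protein: number of distinct interaction partners}."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions
            neighbors[u].add(v)
            neighbors[v].add(u)
    return {protein: len(partners) for protein, partners in neighbors.items()}


def precision_at_k(scores, essential, k):
    """Fraction of the k top-scoring proteins that are in the essential set."""
    # Sort by descending score; ties are broken by protein name for determinism.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(p in essential for p in top) / len(top)


def relative_improvement(new, baseline):
    """Relative gain of one precision value over another (0.5 means 50%)."""
    return (new - baseline) / baseline


# Hypothetical toy network and essential-protein labels (not real data).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
essential = {"A", "C"}

dc = degree_centrality(edges)
print(precision_at_k(dc, essential, k=2))
```

Any other centrality measure can be evaluated the same way by passing its score dictionary to `precision_at_k`, and `relative_improvement` expresses the kind of percentage comparison quoted in the abstract.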
