A Unified Scoring Scheme for Detecting Essential Proteins in Protein Interaction Networks

The essentiality of a gene or protein is important for understanding the minimal requirements for cellular survival and development. Numerous computational methods have been proposed to detect essential proteins from large protein-protein interaction (PPI) datasets. However, only a handful of the essential proteins they predict are shared among them. This suggests that the methods may be complementary, and that an integration scheme exploiting their differences should detect essential proteins more effectively. We introduce a novel algorithm, UniScore, which combines the predictions produced by existing methods. Experimental results on four Saccharomyces cerevisiae PPI datasets show that UniScore consistently produced significantly better predictions and substantially outperformed SVM, one of the most popular and advanced classification techniques. In addition, UniScore identified low-connectivity essential proteins that were previously hard to detect.
