Predicting protein complex membership using probabilistic network reliability.

Evidence for specific protein-protein interactions is increasingly available from both small- and large-scale studies and can be viewed as a network. Errors are known to be frequent in large-scale studies, and the error frequency depends on the large-scale method used. Although interaction evidence is known to be error-prone, edges (connections) in this network are typically treated as simply present or absent. A probabilistic network that accounts for the quantity and quality of supporting evidence should improve inferences drawn from protein networks. Here we demonstrate inference of membership in a partially known protein complex using a probabilistic network model and an algorithm previously used to evaluate reliability in communication networks.
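
To make the idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes each edge carries an independent probability of being a true interaction and estimates, by plain Monte Carlo sampling, the probability that a candidate protein is connected to at least one known member of the complex. The abstract does not specify the reliability algorithm used, so the sampling scheme here is a stand-in; the function names, protein labels, and edge probabilities are all hypothetical.

```python
import random
from collections import defaultdict


def connection_probability(edges, core, target, trials=2000, seed=0):
    """Estimate P(target is connected to at least one core protein).

    edges: dict mapping (u, v) pairs to an assumed probability that the
           interaction is real; edges are treated as undirected and independent.
    core:  set of proteins already known to belong to the complex.
    """
    rng = random.Random(seed)
    core = set(core)
    hits = 0
    for _ in range(trials):
        # Sample one realization of the network: keep each edge with its probability.
        adj = defaultdict(list)
        for (u, v), p in edges.items():
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
        # Depth-first search from the target until a core protein is reached.
        stack, seen = [target], {target}
        found = False
        while stack:
            node = stack.pop()
            if node in core:
                found = True
                break
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        hits += found
    return hits / trials


def rank_candidates(edges, core, candidates, trials=2000, seed=0):
    """Return (candidate, estimated probability) pairs, highest probability first."""
    scores = [(c, connection_probability(edges, core, c, trials, seed))
              for c in candidates]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy network: two known core members and two candidates.
edges = {
    ("core1", "core2"): 0.95,
    ("core1", "candX"): 0.80,
    ("core2", "candX"): 0.60,
    ("candX", "candY"): 0.30,
}
ranking = rank_candidates(edges, {"core1", "core2"}, ["candY", "candX"])
for protein, score in ranking:
    print(protein, round(score, 3))
```

In this toy network, a candidate supported by two moderately reliable edges to the core ranks above one reachable only through a single weak edge, which is the kind of distinction a present-or-absent network cannot make.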
