Categorizing Biases in High-Confidence High-Throughput Protein-Protein Interaction Data Sets

We characterized and evaluated the functional attributes of three yeast high-confidence protein-protein interaction data sets derived from affinity purification/mass spectrometry, protein-fragment complementation assay, and yeast two-hybrid experiments. The interacting proteins retrieved from these data sets formed distinct, partially overlapping sets with different protein-protein interaction characteristics. These differences were primarily a function of the experimental technologies used to recover the interactions. The choice of technology affected the total coverage of interactions, and its effect was especially evident in the recovery of interactions among different functional classes of proteins. We found that the interaction data obtained by the yeast two-hybrid method were the least biased toward any particular functional characterization. In contrast, interacting proteins in the affinity purification/mass spectrometry and protein-fragment complementation assay data sets were over- and under-represented in distinct functional categories. We delineated how these differences affected protein complex organization in the network of interactions, in particular for strongly interacting complexes (e.g. RNA and protein synthesis) versus weakly and transiently interacting complexes (e.g. protein transport). We quantified methodological differences in detecting protein interactions from larger protein complexes, in the correlation of protein abundance among interacting proteins, and in the connectivity of essential proteins. In the latter case, we showed that minimizing inherent methodological biases removed many of the ambiguous conclusions about protein essentiality and protein connectivity. We used these findings to rationalize why biological insights obtained by analyzing data sets originating from different sources sometimes do not agree or may even contradict each other.
An important corollary of this work was that discrepancies in biological insights did not necessarily imply that one detection methodology was better or worse than another, but rather that, to a large extent, the insights reflected the methodological biases themselves. Consequently, interpreting the protein interaction data within their experimental or cellular context provided the best avenue for overcoming biases and inferring biological knowledge.
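
As an illustration of what over- and under-representation of a functional category in an interaction data set can mean in practice, the following minimal sketch computes one-sided hypergeometric tail probabilities. The abstract does not state which statistical test was used, so the hypergeometric model, the function name `hypergeom_tails`, and all numbers in the usage example are assumptions made only for illustration.

```python
from math import comb


def hypergeom_tails(k, n, K, N):
    """Return (P[X <= k], P[X >= k]) for X ~ Hypergeometric(N, K, n).

    N: proteins in the reference proteome (hypothetical background)
    K: proteins in the proteome annotated to the functional category
    n: proteins present in the interaction data set
    k: proteins in the data set annotated to the category
    """
    total = comb(N, n)
    lo = max(0, n - (N - K))  # smallest feasible overlap
    hi = min(n, K)            # largest feasible overlap
    pmf = {
        x: comb(K, x) * comb(N - K, n - x) / total
        for x in range(lo, hi + 1)
    }
    p_under = sum(p for x, p in pmf.items() if x <= k)  # under-representation
    p_over = sum(p for x, p in pmf.items() if x >= k)   # over-representation
    return p_under, p_over


if __name__ == "__main__":
    # Toy numbers, not taken from the paper: 6000 proteins in the proteome,
    # 300 in the category, 1000 in the data set, 90 of them in the category.
    p_under, p_over = hypergeom_tails(k=90, n=1000, K=300, N=6000)
    print(f"P(X <= 90) = {p_under:.3g}, P(X >= 90) = {p_over:.3g}")
```

A small `p_over` would suggest that the category is over-represented in the data set relative to the background, and a small `p_under` that it is under-represented. Testing many categories at once would additionally call for a multiple-testing correction, which this sketch omits.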
