Unraveling protein interaction networks with near-optimal efficiency

The functional characterization of genes and their gene products is the main challenge of the genomic era. Examining interaction information for every gene product is a direct way to assemble the jigsaw puzzle of proteins into a functional map. Here we demonstrate a method that maximizes the information gained from pull-down experiments, in which single proteins act as baits to detect interactions with other proteins, by using a network-based strategy to select the baits. Because protein interaction networks have a scale-free degree distribution, we were able to obtain fast coverage by focusing on highly connected nodes (hubs) first. Unfortunately, locating hubs requires prior global information about the network one is trying to unravel. We therefore present an optimized 'pay-as-you-go' strategy that identifies highly connected nodes using only local information, collected as successive pull-down experiments are performed. Using this strategy, we estimate that 90% of the human interactome can be covered by 10,000 pull-down experiments, with 50% of the interactions confirmed by reciprocal pull-down experiments.
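
The idea of choosing baits from local information alone can be illustrated with a small simulation. The sketch below is not the authors' implementation: it assumes a synthetic preferential-attachment network as a stand-in for a scale-free interactome, idealized pull-downs that reveal every partner of the bait with no false positives or negatives, and a simple selection rule (pick the untested protein that has appeared most often as prey so far). The function names `scale_free_graph` and `run_pulldowns` and all parameter values are hypothetical.

```python
import random


def scale_free_graph(n, m, rng):
    """Grow a preferential-attachment graph with n nodes.

    Each new node links to m existing nodes chosen with probability
    proportional to their current degree (an assumed stand-in for a
    scale-free interaction network).
    """
    adj = {i: set() for i in range(n)}
    endpoints = []  # every node appears once per incident edge
    # Seed the graph with a small clique of m + 1 nodes.
    for i in range(m + 1):
        for j in range(i):
            adj[i].add(j)
            adj[j].add(i)
            endpoints += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            adj[new].add(t)
            adj[t].add(new)
            endpoints += [new, t]
    return adj


def run_pulldowns(adj, n_baits, rng, strategy="local"):
    """Simulate n_baits pull-downs and return edge coverage after each one.

    strategy="local":  next bait is the untested protein seen most often
                       as prey so far (ties broken at random).
    strategy="random": next bait is a uniformly random untested protein.
    """
    total_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    nodes = sorted(adj)
    used = set()         # proteins already used as bait
    seen_edges = set()   # interactions detected so far
    prey_count = {}      # protein -> number of baits that pulled it down
    coverage = []
    for _ in range(n_baits):
        candidates = [p for p in prey_count if p not in used]
        if strategy == "local" and candidates:
            bait = max(candidates, key=lambda p: (prey_count[p], rng.random()))
        else:
            # No local information yet (or random strategy): pick blindly.
            bait = rng.choice([p for p in nodes if p not in used])
        used.add(bait)
        for prey in sorted(adj[bait]):
            seen_edges.add(frozenset((bait, prey)))
            prey_count[prey] = prey_count.get(prey, 0) + 1
        coverage.append(len(seen_edges) / total_edges)
    return coverage


if __name__ == "__main__":
    network = scale_free_graph(2000, 2, random.Random(0))
    for name in ("local", "random"):
        cov = run_pulldowns(network, 200, random.Random(1), strategy=name)
        print(f"{name:>6}: {cov[-1]:.1%} of interactions after 200 baits")
```

Comparing the two coverage curves on the same network shows how much a locally guided choice of baits gains over blind selection in this toy setting. The numbers it prints say nothing about the human interactome estimates quoted above.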
