Accessible Protein Interaction Data for Network Modeling. Structure of the Information and Available Repositories

Recent years have seen a rapid expansion of computational studies of molecular biology systems, particularly analyses of the structure and organization of molecular networks, as initial steps toward simulating the behavior of simple cellular systems. This task depends on the availability of a new class of data derived from experimental proteomics. Large-scale applications of the yeast two-hybrid system, tandem affinity purification coupled to mass spectrometry (TAP-MS), and other methodologies are providing, for the first time, overviews of complete protein interaction networks. A number of computational methods are also contributing substantially to the identification of protein interactions by comparing genome organization and evolution. Other disciplines, such as structural biology and computational structural biology, complement the information on interaction networks by providing detailed molecular descriptions of the corresponding complexes, which will become essential for the direct manipulation of the networks using theoretical or experimental methods. The storage, manipulation, and visualization of the large volumes of information about protein interactions and networks pose similar problems irrespective of whether the information is experimental or computational in origin. Accordingly, a number of competing systems and emerging standards have appeared in parallel with the publication of the data. In this review, we provide an overview of the main experimental high-throughput methods for the study of protein interactions, the parallel development of computational methods for predicting protein interactions from genome and sequence information, and the development of databases and standards that facilitate the analysis of all this information.
