Computational Prediction of Protein–Protein Interaction Networks: Algorithms and Resources

Protein interactions play an important role in the discovery of protein functions and pathways in biological processes. This is especially true for diseases caused by the loss of specific protein-protein interactions in the organism. The accuracy of experimental methods for detecting protein-protein interactions, however, is questionable: high-throughput experiments have shown high rates of both false positives and false negatives. Computational methods have attracted considerable attention among biologists because they can predict protein-protein interactions and validate experimental results. In this study, we review several computational methods for protein-protein interaction prediction, describe the major databases that store both predicted and experimentally detected protein-protein interactions, and survey the tools used for analyzing protein interaction networks and improving the reliability of protein-protein interaction data.
