Evolution of Sequence-based Bioinformatics Tools for Protein-protein Interaction Prediction

Protein-protein interactions (PPIs) are physical contacts between two or more proteins, mediated by electrostatic forces or hydrophobic effects. Identifying PPIs is pivotal because they underlie many biological processes and inform the study of protein function, disease incidence, and therapy design. Experimental identification of PPIs via high-throughput technology is time-consuming and expensive, and bioinformatics approaches are expected to overcome these limitations. In this review, our main goal is to provide an inclusive view of existing sequence-based computational methods for PPI prediction. We first briefly introduce the currently available PPI databases, then review the state-of-the-art bioinformatics approaches, their working principles, and their performance. Finally, we discuss the caveats and future perspectives of next-generation algorithms for PPI prediction.
