Protein Interface Complementarity and Gene Duplication Improve Link Prediction of Protein-Protein Interaction Network

Protein-protein interactions (PPIs) are the foundation of cellular life activities. At present, the known PPIs account for only a small part of the total. As experimental and computing technologies develop, more PPI data are being mined and PPI networks are becoming denser, which makes it possible to predict PPIs from the perspective of network structure. Although many high-throughput experimental methods can detect PPIs, these experiments are costly and time-consuming, and they carry a certain error rate. Network-based approaches can provide candidate protein pairs for high-throughput experiments and improve their accuracy. This paper presents a new link prediction approach, “Sim,” for PPI networks, based on proteins' complementary interfaces and gene duplication. Integrating “Sim” with the state-of-the-art network-based approach “L3” improves prediction accuracy and robustness.
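
As background for the network-based baseline named above, the L3 idea is commonly described as scoring a candidate protein pair by the paths of length three that connect the two proteins, with each path down-weighted by the degrees of its two intermediate nodes. The following is a minimal sketch under that assumption. The function name `l3_scores` and the toy edge list are hypothetical, and the code does not implement the “Sim” score, whose definition is not given in this abstract.

```python
from itertools import combinations
from math import sqrt


def l3_scores(edges):
    """Score non-adjacent node pairs by degree-normalized paths of length 3.

    edges: iterable of (u, v) pairs describing an undirected PPI network.
    Returns a dict mapping (x, y) with x < y to a positive score.
    """
    # Build an undirected adjacency map, ignoring self-loops.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    scores = {}
    for x, y in combinations(sorted(adj), 2):
        if y in adj[x]:
            continue  # already a known interaction, not a candidate
        total = 0.0
        # Enumerate paths x - u - v - y and weight each by 1 / sqrt(k_u * k_v).
        for u in adj[x]:
            for v in adj[y]:
                if v in adj[u]:
                    total += 1.0 / sqrt(len(adj[u]) * len(adj[v]))
        if total > 0:
            scores[(x, y)] = total
    return scores


# Hypothetical toy network: a single chain A - U - V - B.
toy_edges = [("A", "U"), ("U", "V"), ("V", "B")]
ranked = sorted(l3_scores(toy_edges).items(), key=lambda kv: -kv[1])
print(ranked)
```

In this toy chain the only candidate pair with a length-three path is (A, B). A combined predictor could, in principle, re-rank such candidates with an additional similarity term, but how “Sim” and L3 are actually combined is left to the full paper.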
