Evolutionary analysis and interaction prediction for protein-protein interaction network in geometric space

Prediction of protein-protein interactions (PPIs) remains a central task in systems biology. As more PPIs are identified and assembled into PPI networks, it has become both feasible and imperative to study PPIs at the network level, for example through evolutionary analysis of the networks. Such analysis improves our understanding of PPI networks and, by leveraging network-level information, allows more accurate prediction of pairwise PPIs. In this work we developed a novel method that incorporates evolutionary information into geometric space to improve PPI prediction; the prediction performance can in turn be used to select and evaluate evolutionary models. The method is tested with cross-validation on human and yeast PPI network data. The results show that the accuracy of PPI prediction, measured by ROC score, increases by up to 14.6% compared with a baseline that uses no evolutionary information. The results also indicate that our modified evolutionary model DANEOsf, which combines a gene duplication/neofunctionalization model with a scale-free model, fits these two PPI networks better and predicts their interactions more effectively than the other tested models. The improved prediction performance suggests that DANEOsf may capture the underlying evolutionary mechanism of these two networks better than the other tested models. Of particular importance, our method therefore offers an effective way to select the evolutionary models that best capture the underlying evolutionary mechanisms, by evaluating the fitness of each model from the perspective of PPI prediction on real PPI networks.
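As a rough illustration of the evaluation idea described above (scoring candidate pairs in a geometric space and summarizing prediction accuracy with a ROC score), the following minimal sketch may help. It is not the authors' implementation. The toy coordinates, the node names, the use of negative Euclidean distance as the interaction score, and the helper names `roc_auc` and `score_pairs` are all assumptions made for illustration only.

```python
import math

def roc_auc(pos_scores, neg_scores):
    """ROC score as the probability that a random positive pair
    outranks a random negative pair (ties count as one half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def score_pairs(coords, pairs):
    """Assumed scoring rule: nodes that lie closer together in the
    embedding get a higher score (negative Euclidean distance)."""
    return [-math.dist(coords[u], coords[v]) for u, v in pairs]

# Hypothetical toy embedding (node -> 2-D coordinates), not data from the paper.
coords = {
    "A": (0.0, 0.0), "B": (0.1, 0.0), "C": (0.0, 0.2),
    "D": (3.0, 3.0), "E": (3.1, 3.0),
}
held_out_edges = [("A", "B"), ("D", "E")]          # treated as true interactions
non_edges = [("A", "D"), ("B", "E"), ("C", "D")]   # treated as non-interactions

auc = roc_auc(score_pairs(coords, held_out_edges),
              score_pairs(coords, non_edges))
print(f"ROC score on the toy example: {auc:.2f}")
```

In a cross-validation setting, one would presumably repeat this over several folds, each time hiding a subset of known interactions before embedding and then scoring the hidden pairs against sampled non-interacting pairs.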
