Increasing reliability of protein interactome by fast manifold embedding

Over the last decade, the development of high-throughput techniques has resulted in a rapid accumulation of protein-protein interaction (PPI) data. However, high-throughput experimental interaction data are prone to high false-positive rates. It is therefore highly desirable to develop computational approaches that address this problem. In this paper, we develop a robust computational technique for assessing the reliability of interactions using a fast manifold embedding algorithm. A fast isometric feature mapping (fast-ISOMAP) is proposed to transform a PPI network into a low-dimensional metric space, which recasts the problem of assessing protein interactions as one of measuring the similarity between points in that metric space. A reliability index (RI), a likelihood indicating that two proteins interact, is then assigned to each protein pair in the PPI network based on the similarity between the corresponding points in the embedding space. The proposed method is validated with extensive experiments on yeast PPI networks. The results demonstrate that the interactions ranked highest by our method have high functional homogeneity and localization coherence. The proposed algorithm is therefore a promising method for detecting false-positive interactions in PPI networks.
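The pipeline described above (embed the network, then score each pair by closeness in the embedding) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain ISOMAP on hop-count distances rather than the fast variant, it assumes a connected, unweighted network, and the score 1 / (1 + d) is an assumed stand-in for the reliability index, whose exact form is not given here. The function names are hypothetical.

```python
from collections import deque

import numpy as np


def shortest_path_matrix(n, edges):
    """All-pairs hop distances on an unweighted, undirected graph via BFS."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    dist = np.full((n, n), np.inf)
    for s in range(n):
        dist[s, s] = 0.0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if np.isinf(dist[s, v]):
                    dist[s, v] = dist[s, u] + 1.0
                    queue.append(v)
    return dist


def classical_mds(dist, dim):
    """Embed points so Euclidean distances approximate `dist` (classical MDS)."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]
    # Clip small negative eigenvalues, which arise for non-Euclidean distances.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


def reliability_index(coords, u, v):
    """Assumed score: closer points in the embedding get a value nearer 1."""
    d = np.linalg.norm(coords[u] - coords[v])
    return 1.0 / (1.0 + d)


# Toy network: two 4-protein cliques joined by a single edge (3, 4).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
         (3, 4)]
dist = shortest_path_matrix(8, edges)   # requires a connected graph
coords = classical_mds(dist, dim=1)
scores = {edge: reliability_index(coords, *edge) for edge in edges}
```

In this toy network the edge joining the two cliques receives a lower score than an edge between two proteins that share all their neighbours, which is the kind of ranking the reliability index is meant to produce. On real data the embedding dimension and the scoring function would need to be chosen and validated.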