HubAlign: an accurate and efficient method for global alignment of protein–protein interaction networks

Motivation: High-throughput experimental techniques have produced a large amount of protein–protein interaction (PPI) data. The study of PPI networks, for example through comparative analysis, can advance our understanding of life processes and diseases at the molecular level. One approach to comparative analysis is to align PPI networks in order to identify conserved or species-specific subnetwork motifs. A few methods have been developed for global PPI network alignment, but the problem remains challenging in terms of both accuracy and efficiency.

Results: This paper presents a novel global network alignment algorithm, denoted HubAlign, that makes use of both network topology and sequence homology information. It is based on the observation that topologically important proteins in a PPI network are usually much more conserved and thus more likely to be aligned. HubAlign uses a minimum-degree heuristic algorithm to estimate the topological and functional importance of a protein from the global network topology. HubAlign then aligns topologically important proteins first and gradually extends the alignment to the whole network. Extensive tests indicate that HubAlign greatly outperforms several popular methods in terms of both accuracy and efficiency, especially in detecting functionally similar proteins.

Availability: HubAlign is freely available for non-commercial purposes at http://ttic.uchicago.edu/~hashemifar/software/HubAlign.zip

Contact: jinboxu@gmail.com

Supplementary information: Supplementary data are available at Bioinformatics online.
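The two-stage idea in the Results (score proteins by a minimum-degree heuristic, then align important proteins first and extend outward) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give HubAlign's scoring formulas, so the function names, the use of removal order as the importance score, the `min`-based topological pair score, and the weight `alpha` that balances topology against sequence similarity are all assumptions made for illustration.

```python
def importance_by_min_degree_peeling(adj):
    """Assumed stand-in for a minimum-degree heuristic.

    adj maps each node to the set of its neighbours. Nodes of minimum
    current degree are removed one at a time; a node removed later gets
    a higher score, normalised to the range [0, 1].
    """
    degree = {v: len(ns) for v, ns in adj.items()}
    remaining = set(adj)
    order = {}
    rank = 0
    while remaining:
        # Ties are broken by node name so the result is deterministic.
        v = min(remaining, key=lambda u: (degree[u], str(u)))
        remaining.remove(v)
        order[v] = rank
        rank += 1
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    top = max(len(order) - 1, 1)
    return {v: r / top for v, r in order.items()}


def greedy_align(adj1, adj2, seq_sim=None, alpha=0.7):
    """Greedy seed-and-extend alignment (illustrative only).

    seq_sim maps (node1, node2) pairs to a sequence similarity in [0, 1];
    missing pairs count as 0. alpha is a hypothetical weight on topology.
    Returns a dict mapping nodes of the first network to the second.
    """
    seq_sim = seq_sim or {}
    imp1 = importance_by_min_degree_peeling(adj1)
    imp2 = importance_by_min_degree_peeling(adj2)

    def pair_score(pair):
        u, v = pair
        topo = min(imp1[u], imp2[v])  # high only if both nodes are important
        return alpha * topo + (1 - alpha) * seq_sim.get(pair, 0.0)

    alignment, used = {}, set()
    while len(alignment) < min(len(adj1), len(adj2)):
        # Prefer pairs adjacent to an already aligned pair (extension step).
        frontier = {(u, v)
                    for a, b in alignment.items()
                    for u in adj1[a] if u not in alignment
                    for v in adj2[b] if v not in used}
        if not frontier:
            # No aligned neighbours yet: consider every unaligned pair (seed).
            frontier = {(u, v)
                        for u in adj1 if u not in alignment
                        for v in adj2 if v not in used}
        u, v = max(sorted(frontier), key=pair_score)
        alignment[u] = v
        used.add(v)
    return alignment


# Toy example: two identical small networks, each with one hub.
g1 = {"h": {"a", "b", "c"}, "a": {"h"}, "b": {"h"}, "c": {"h", "d"}, "d": {"c"}}
g2 = {"H": {"A", "B", "C"}, "A": {"H"}, "B": {"H"}, "C": {"H", "D"}, "D": {"C"}}
toy_alignment = greedy_align(g1, g2)
```

In this toy case the hubs are paired first and the alignment then grows along their neighbours. The quadratic candidate search is only suitable for small examples; real PPI networks would need a priority queue or similar structure.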
