Scalable multiple global network alignment for biological data

Advances in high-throughput technology have led to an increase in the amount of available protein-protein interaction (PPI) data. Detecting and extracting functional modules that are common across multiple networks is an important step towards understanding the role of these modules and how they have evolved across species. A global PPI network alignment algorithm attempts to find such functional orthologs across multiple networks. In this article, we propose a scalable global network alignment algorithm based on clustering methods and graph matching techniques. The algorithm detects conserved interactions while simultaneously attempting to maximize the sequence similarity of the nodes involved in the alignment. We also present an algorithm for multiple alignment, in which several PPI networks are aligned at once. We empirically evaluated our algorithm on several real biological datasets and find that our approach offers a significant benefit in both quality and speed over the state of the art.
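To make the two competing goals concrete, the following minimal sketch scores a one-to-one alignment of two toy networks by combining the number of conserved interactions with the summed sequence similarity of the aligned pairs. It is an illustration only and not the algorithm proposed in this article: the weight `alpha`, the function names, and the toy data are assumptions introduced here, and an exhaustive search stands in for the clustering and graph matching steps, so it is practical only for very small networks.

```python
from itertools import permutations


def alignment_score(edges_a, edges_b, similarity, mapping, alpha=0.5):
    """Score a one-to-one mapping from nodes of network A to nodes of network B.

    The score is alpha * (conserved edges) + (1 - alpha) * (summed similarity).
    `alpha` is a hypothetical trade-off weight, not a value from the article.
    """
    edges_b_set = {frozenset(e) for e in edges_b}
    # An edge (u, v) of A is conserved if its image is an edge of B.
    conserved = sum(
        1
        for u, v in edges_a
        if u in mapping
        and v in mapping
        and frozenset((mapping[u], mapping[v])) in edges_b_set
    )
    # Missing similarity entries are treated as zero.
    seq = sum(similarity.get((a, b), 0.0) for a, b in mapping.items())
    return alpha * conserved + (1 - alpha) * seq


def best_alignment(nodes_a, nodes_b, edges_a, edges_b, similarity, alpha=0.5):
    """Exhaustively search injective mappings A -> B (requires |A| <= |B|)."""
    best_map, best_score = {}, float("-inf")
    for perm in permutations(nodes_b, len(nodes_a)):
        candidate = dict(zip(nodes_a, perm))
        score = alignment_score(edges_a, edges_b, similarity, candidate, alpha)
        if score > best_score:
            best_map, best_score = candidate, score
    return best_map, best_score


# Toy data (hypothetical): two three-protein path networks.
nodes_a = ["a1", "a2", "a3"]
edges_a = [("a1", "a2"), ("a2", "a3")]
nodes_b = ["b1", "b2", "b3"]
edges_b = [("b1", "b2"), ("b2", "b3")]
similarity = {
    ("a1", "b1"): 0.9,
    ("a2", "b2"): 0.8,
    ("a3", "b3"): 0.7,
    ("a1", "b3"): 0.2,
}

mapping, score = best_alignment(nodes_a, nodes_b, edges_a, edges_b, similarity)
print(mapping, round(score, 3))
```

In this toy case two mappings conserve both interactions, and the sequence-similarity term decides between them. The exhaustive search grows factorially with network size, which is why scalable heuristics are needed for real PPI networks.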
