GEDEVO: An Evolutionary Graph Edit Distance Algorithm for Biological Network Alignment

Introduction: With so-called OMICS technologies, the scientific community has generated huge amounts of data that allow us to reconstruct the interplay of all kinds of biological entities. The emerging interaction networks are usually modeled as graphs with thousands of nodes and tens of thousands of edges between them. In addition to sequence alignment, the comparison of biological networks has shown great potential for inferring the biological function of proteins and genes. However, the corresponding network alignment problem is computationally hard and intractable for real-world instances.

Results: We therefore developed GEDEVO, a novel tool for efficient graph comparison dedicated to real-world-sized biological networks. Underlying our approach is the so-called Graph Edit Distance (GED) model, where one graph is to be transformed into another with a minimal number of (or, more generally, minimal cost of) edge insertions and deletions. We present a novel evolutionary algorithm aiming to minimize the GED, and we compare our implementation against state-of-the-art tools: SPINAL, GHOST, C-GRAAL, and MI-GRAAL. On a set of protein-protein interaction networks from different organisms, we demonstrate that GEDEVO outperforms the current methods. It thus refines previously suggested alignments based on topological information only.

Conclusion: With GEDEVO, we account for the rapidly growing number and size of available biological networks. The software as well as all data sets used are publicly available at http://gedevo.mpi-inf.mpg.de.
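To make the GED model concrete, the following minimal Python sketch counts the edge insertions and deletions implied by a given node mapping and then improves the mapping with a very simple mutation-only evolutionary search. It is an illustration under stated assumptions, not the GEDEVO implementation: the function names (`induced_ged`, `evolve_mapping`), the restriction to undirected graphs with equally many nodes, the unit edit costs, and the swap mutation are all hypothetical choices made for this example.

```python
import random


def induced_ged(edges_a, edges_b, mapping):
    """Count edge insertions plus deletions needed to turn graph A into
    graph B under a one-to-one node mapping (unit costs, undirected edges)."""
    # Translate every edge of A into B's node names; sort endpoints so that
    # (u, v) and (v, u) are treated as the same undirected edge.
    mapped = {tuple(sorted((mapping[u], mapping[v]))) for u, v in edges_a}
    target = {tuple(sorted(e)) for e in edges_b}
    # Edges only in `mapped` must be deleted, edges only in `target` inserted.
    return len(mapped ^ target)


def evolve_mapping(nodes_a, nodes_b, edges_a, edges_b, generations=2000, seed=0):
    """Hypothetical (1+1)-style search: mutate the mapping by swapping two
    assignments and keep the mutant if it does not increase the GED.
    Assumes both graphs have the same number of nodes (at least two)."""
    rng = random.Random(seed)
    perm = list(nodes_b)
    rng.shuffle(perm)
    best = dict(zip(nodes_a, perm))
    best_cost = induced_ged(edges_a, edges_b, best)
    for _ in range(generations):
        if best_cost == 0:
            break  # the graphs are aligned without any edit
        i, j = rng.sample(range(len(perm)), 2)
        perm[i], perm[j] = perm[j], perm[i]          # swap mutation
        candidate = dict(zip(nodes_a, perm))
        cost = induced_ged(edges_a, edges_b, candidate)
        if cost <= best_cost:
            best, best_cost = candidate, cost        # accept the mutant
        else:
            perm[i], perm[j] = perm[j], perm[i]      # revert the swap
    return best, best_cost


if __name__ == "__main__":
    # Toy example: a 3-node path aligned against a triangle.
    path = [(1, 2), (2, 3)]
    triangle = [("x", "y"), ("y", "z"), ("x", "z")]
    mapping, cost = evolve_mapping([1, 2, 3], ["x", "y", "z"], path, triangle)
    print(mapping, cost)
```

A real aligner for protein-protein interaction networks would additionally have to handle graphs of different sizes, non-unit costs, and a full population with recombination; none of that is modeled here.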
