Spectral Alignment of Networks

Network alignment refers to the problem of finding a bijective mapping across vertices of two graphs to maximize the number of overlapping edges and/or to minimize the number of mismatched interactions across networks. This problem arises in many fields, such as computational biology, social sciences, and computer vision, and is often cast as an expensive quadratic assignment problem (QAP). Although spectral methods have received significant attention in different network science problems such as network clustering, the use of spectral techniques in the network alignment problem has been limited, partially owing to the lack of principled connections between spectral methods and relaxations of the network alignment optimization. In this paper, we propose a network alignment framework that uses an orthogonal relaxation of the underlying QAP in a maximum weight bipartite matching optimization. Our method takes into account the ellipsoidal level sets of the quadratic objective function by exploiting eigenvalues and eigenvectors of (transformations of) adjacency graphs. Our framework not only provides a theoretical justification for existing heuristic spectral network alignment methods, but also leads to a new scalable network alignment algorithm which outperforms existing ones over various synthetic and real networks. Moreover, we generalize the objective function of the network alignment problem to consider both matched and mismatched interactions in a standard QAP formulation. This can be critical in applications where networks have low similarity and we therefore expect more mismatches than matches. We assess the effectiveness of our proposed method theoretically for certain classes of networks, through simulations over various synthetic network models, and in two real-data applications: comparative analysis of gene regulatory networks across human, fly and worm, and user de-anonymization over Twitter follower subgraphs.
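
To make the objective described above concrete, the following toy sketch scores a vertex mapping by rewarding overlapping edges and penalizing mismatched ones, then finds the best mapping by exhaustive search. It is not the spectral algorithm proposed in the paper: the function names, the `match_weight` and `mismatch_weight` parameters, and the brute-force search are illustrative assumptions, and the search over all permutations is only feasible for very small graphs, which is the cost that relaxations of the QAP are meant to avoid.

```python
from itertools import permutations


def alignment_score(edges1, edges2, mapping, match_weight=1.0, mismatch_weight=0.0):
    """Score a bijective vertex mapping between two undirected simple graphs.

    edges1, edges2: sets of frozenset({u, v}) edges.
    mapping: dict sending each vertex of graph 1 to a vertex of graph 2.
    match_weight and mismatch_weight are hypothetical knobs for this sketch.
    """
    # Relabel the edges of graph 1 with the vertex names of graph 2.
    mapped = {frozenset(mapping[u] for u in edge) for edge in edges1}
    matches = len(mapped & edges2)      # edges present in both graphs
    mismatches = len(mapped ^ edges2)   # edges present in exactly one graph
    return match_weight * matches - mismatch_weight * mismatches


def brute_force_align(nodes1, nodes2, edges1, edges2, **weights):
    """Try every bijection nodes1 -> nodes2 and keep the best-scoring one."""
    best_map, best_score = None, float("-inf")
    for perm in permutations(nodes2):
        mapping = dict(zip(nodes1, perm))
        score = alignment_score(edges1, edges2, mapping, **weights)
        if score > best_score:
            best_map, best_score = mapping, score
    return best_map, best_score


# Two 3-vertex paths whose middle vertices are "b" and 1, respectively.
path1 = {frozenset(e) for e in [("a", "b"), ("b", "c")]}
path2 = {frozenset(e) for e in [(1, 2), (1, 3)]}
best_map, best_score = brute_force_align(["a", "b", "c"], [1, 2, 3], path1, path2)
print(best_map["b"], best_score)  # prints: 1 2.0
```

Setting `mismatch_weight` above zero corresponds to the generalized objective that accounts for mismatched interactions as well as matches, which matters when the two networks have low similarity.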
