REGAL: Representation Learning-based Graph Alignment

Problems involving multiple networks arise in many scientific and other domains. In particular, network alignment, the task of identifying corresponding nodes in different networks, has applications across the social and natural sciences. Motivated by recent advances in node representation learning for single-graph tasks, we propose REGAL (REpresentation learning-based Graph ALignment), a framework that uses automatically learned node representations to match nodes across different graphs. Within REGAL we devise xNetMF, an elegant and principled node embedding formulation that uniquely generalizes to multi-network problems. Our results demonstrate the utility and promise of unsupervised representation learning-based network alignment in terms of both speed and accuracy. REGAL runs up to 30x faster in the representation learning stage than comparable methods, outperforms existing network alignment methods by 20 to 30% in accuracy on average, and scales to networks with millions of nodes each.
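
As a rough illustration of the matching idea described above, the following sketch aligns two graphs by assigning each node in one graph to the node in the other graph whose embedding is closest. It is not xNetMF and not the authors' implementation: the embeddings are synthetic stand-ins (a noisy, permuted copy of a random matrix), and the function name `align_by_embedding` and all parameters are assumptions made for this example.

```python
import numpy as np


def align_by_embedding(emb_a, emb_b):
    """Hypothetical helper: for each row of emb_a, return the index of the
    nearest row of emb_b under Euclidean distance (greedy matching)."""
    # Squared distances via ||a||^2 - 2 a.b + ||b||^2, using broadcasting.
    sq_a = np.sum(emb_a ** 2, axis=1)[:, None]
    sq_b = np.sum(emb_b ** 2, axis=1)[None, :]
    dist = sq_a - 2.0 * (emb_a @ emb_b.T) + sq_b
    return np.argmin(dist, axis=1)


# Synthetic stand-in for learned embeddings (assumption: real embeddings
# would come from a representation learning step such as xNetMF).
rng = np.random.default_rng(0)
n_nodes, dim = 50, 8
emb_a = rng.normal(size=(n_nodes, dim))

# Graph B is modeled as a permuted copy of graph A with small noise:
# row j of emb_b corresponds to node perm[j] of graph A.
perm = rng.permutation(n_nodes)
emb_b = emb_a[perm] + 0.01 * rng.normal(size=(n_nodes, dim))

match = align_by_embedding(emb_a, emb_b)

# The true counterpart of node i in A is the inverse permutation at i.
true_match = np.argsort(perm)
accuracy = float(np.mean(match == true_match))
```

The dense distance matrix used here grows quadratically with the number of nodes, so it only suits small examples; at the scale the abstract mentions, some faster nearest-neighbor search would be needed in its place.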
