The Power of D-hops in Matching Power-Law Graphs

This paper studies seeded graph matching for power-law graphs. Assume that two edge-correlated graphs are independently edge-sampled from a common parent graph with a power-law degree distribution. A set of correctly matched vertex pairs is chosen at random and revealed as initial seeds. Our goal is to use the seeds to recover the remaining latent vertex correspondence between the two graphs. Departing from existing approaches that focus on high-degree seeds in $1$-hop neighborhoods, we develop an efficient algorithm that exploits low-degree seeds in suitably defined $D$-hop neighborhoods. Specifically, we first match a set of vertex pairs with appropriate degrees (which we refer to as the first slice) based on the number of low-degree seeds in their $D$-hop neighborhoods. This approach significantly reduces the number of initial seeds needed to trigger a cascading process that matches the rest of the graphs. Under the Chung-Lu random graph model with $n$ vertices, maximum degree $\Theta(\sqrt{n})$, and power-law exponent $2<\beta<3$, by optimally choosing the first slice, our algorithm can, with high probability, correctly match a constant fraction of the true pairs without any error, provided with only $\Omega((\log n)^{4-\beta})$ initial seeds. Our result achieves an exponential reduction in the seed size requirement, as the best previously known result requires $n^{1/2+\epsilon}$ seeds (for any small constant $\epsilon>0$). Performance evaluation with synthetic and real data further corroborates the improved performance of our algorithm.
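
To make the setup concrete, the following is a minimal Python sketch of the model and of the seed-counting statistic described above. It is an illustration under stated assumptions, not the paper's algorithm: the function names (`chung_lu`, `subsample`, `d_hop`, `seed_witnesses`), the weight vector, and the sampling probability `s` are all hypothetical, and the $D$-hop neighborhood here is a plain breadth-first ball, without the degree restrictions that the "suitably defined" neighborhoods and the first slice would impose.

```python
import random
from collections import deque
from itertools import combinations


def chung_lu(weights, rng):
    """Parent graph: edge {i, j} appears independently with
    probability min(1, w_i * w_j / sum(w)) (Chung-Lu model)."""
    n, total = len(weights), sum(weights)
    adj = {v: set() for v in range(n)}
    for i, j in combinations(range(n), 2):
        if rng.random() < min(1.0, weights[i] * weights[j] / total):
            adj[i].add(j)
            adj[j].add(i)
    return adj


def subsample(adj, s, rng):
    """Keep each edge of adj independently with probability s.
    Calling this twice on the same parent gives two correlated graphs."""
    out = {v: set() for v in adj}
    for u in adj:
        for v in adj[u]:
            if u < v and rng.random() < s:  # visit each edge once
                out[u].add(v)
                out[v].add(u)
    return out


def d_hop(adj, source, d):
    """Vertices within graph distance d of source, excluding source (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == d:
            continue  # do not expand beyond d hops
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    del dist[source]
    return set(dist)


def seed_witnesses(g1, g2, u, v, seeds, d):
    """Count seed pairs (w1, w2) with w1 within d hops of u in g1
    and w2 within d hops of v in g2."""
    n1, n2 = d_hop(g1, u, d), d_hop(g2, v, d)
    return sum(1 for w1, w2 in seeds if w1 in n1 and w2 in n2)
```

A typical use would build a parent graph with `chung_lu`, call `subsample` twice to obtain the two correlated graphs, and then compare `seed_witnesses` across candidate pairs `(u, v)`: a true pair tends to share more seeds in its $D$-hop neighborhoods than a false pair. How the threshold, the degree ranges, and $D$ should be chosen is exactly what the analysis in the paper addresses and is not captured by this sketch.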
