Paper Information - NED: An Inter-Graph Node Metric Based On Edit Distance

NED: An Inter-Graph Node Metric Based On Edit Distance

Node similarity is a fundamental problem in graph analytics. However, similarity between nodes in different graphs (inter-graph nodes) has received little attention. Inter-graph node similarity is important for learning a new graph from knowledge of an existing graph (transfer learning on graphs) and has applications in biological, communication, and social networks. In this paper, we propose a novel edit-distance-based function for measuring inter-graph node similarity, called NED. In NED, two nodes are compared according to their local neighborhood structures, which are represented as unordered k-adjacent trees, without relying on labels or other assumptions. Since computing the tree edit distance on unordered trees is NP-Complete, we propose a modified tree edit distance, called TED*, for comparing neighborhood trees. TED* is a metric, like the original tree edit distance, but more importantly, TED* is computable in polynomial time. As a metric, NED admits efficient indexing, provides interpretable results, and is shown to perform better than existing approaches on a number of data analysis tasks, including graph de-anonymization. Finally, the efficiency and effectiveness of NED are empirically demonstrated using real-world graphs.
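To make the notion of a local neighborhood tree more concrete, the following is a minimal Python sketch of how such a tree might be extracted. It assumes that a k-adjacent tree is the breadth-first search tree rooted at the node and truncated at depth k; the abstract does not spell out the construction, so this reading is an assumption. The function names `k_adjacent_tree` and `level_sizes` and the toy graph are hypothetical, and the sketch does not implement TED* itself, whose definition is not given here.

```python
from collections import deque


def k_adjacent_tree(adj, root, k):
    """Return the BFS tree rooted at `root`, cut off at depth k.

    `adj` maps each node to an iterable of its neighbours.
    The tree is returned as a dict: node -> list of child nodes.
    Assumption: "k-adjacent tree" means this truncated BFS tree.
    """
    children = {root: []}
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        if depth[u] == k:
            continue  # do not expand nodes on the last level
        for v in adj.get(u, ()):
            if v not in depth:  # first visit fixes v's parent and depth
                depth[v] = depth[u] + 1
                children[v] = []
                children[u].append(v)
                queue.append(v)
    return children


def level_sizes(children, root):
    """Number of tree nodes on each level, a label-free structural summary."""
    sizes = []
    frontier = [root]
    while frontier:
        sizes.append(len(frontier))
        frontier = [c for u in frontier for c in children[u]]
    return sizes


# Toy undirected graph (hypothetical): a-b, a-c, b-d, c-d, d-e
adj = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
tree = k_adjacent_tree(adj, "a", 2)
print(tree)
print(level_sizes(tree, "a"))
```

Because the tree is built only from adjacency, two nodes from different graphs can be compared through their trees without any node labels, which is the setting the abstract describes. The order of children in each list is an artifact of the input and carries no meaning, since the trees are treated as unordered.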
