InfMatch: Finding isomorphism subgraph on a big target graph based on the importance of vertex

Abstract Subgraph matching is an important research topic in graph theory and is now applied in many areas. Subgraph matching algorithms consist of two main phases: filtering and verification. However, after the candidate set for each query node is initialized, it still contains many invalid nodes, which can cause a large amount of redundant computation during filtering. To address this problem, we propose InfMatch, a subgraph matching algorithm based on node influence, to improve the performance of subgraph matching on a large target graph. Specifically, we find the central node of the query graph by calculating the global and local influence values of each query node; the candidate matching nodes for every other query node are then found in the neighborhood region of the candidates for the central node. Because the chosen central node is tightly connected to other nodes, isolated nodes cannot be added to the candidate matching set for the central node, and thus many unqualified candidate vertices are pruned. To prune unqualified candidate nodes further, we propose several filtering strategies tailored to the characteristics of our method. Moreover, we improve the matching order selection strategy by taking edge constraints into account. Extensive experiments demonstrate that our method is more efficient.
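The central-node idea can be illustrated with a minimal Python sketch. It is not the InfMatch implementation: the abstract does not define the influence measures, so this sketch assumes normalized degree as the local influence and closeness centrality as the global influence, and it assumes a simple label-and-degree check as the candidate filter. All function names (`central_node`, `central_candidates`, `region_candidates`) are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distance from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def central_node(q_adj):
    """Query node with the highest combined influence score.

    ASSUMPTION: local influence = normalized degree and global
    influence = closeness centrality. The paper's measures may differ.
    Ties are broken in favor of the smallest node id.
    """
    n = len(q_adj)
    best, best_score = None, -1.0
    for u in sorted(q_adj):
        dist = bfs_distances(q_adj, u)
        total = sum(dist.values())
        closeness = (len(dist) - 1) / total if total else 0.0
        local = len(q_adj[u]) / (n - 1) if n > 1 else 0.0
        score = local + closeness
        if score > best_score:
            best, best_score = u, score
    return best


def central_candidates(q_adj, q_label, g_adj, g_label, center):
    """Target nodes with the center's label and at least its degree.

    Because the center has neighbors, isolated target nodes never pass.
    """
    need = len(q_adj[center])
    return [v for v in sorted(g_adj)
            if g_label[v] == q_label[center] and len(g_adj[v]) >= need]


def region_candidates(q_adj, q_label, g_adj, g_label, center, c):
    """Candidates for each query node, searched only around c.

    A query node at hop distance d from the center may only match a
    target node within d hops of the central candidate c (assumed rule).
    """
    q_dist = bfs_distances(q_adj, center)
    g_dist = bfs_distances(g_adj, c)
    result = {}
    for u in sorted(q_adj):
        result[u] = [v for v in sorted(g_dist)
                     if g_dist[v] <= q_dist[u]
                     and g_label[v] == q_label[u]
                     and len(g_adj[v]) >= len(q_adj[u])]
    return result


# Toy query graph: node 0 is adjacent to 1, 2 and 3; nodes 1 and 2 are adjacent.
q_adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
q_label = {0: "A", 1: "B", 2: "B", 3: "C"}

# Toy target graph: node 14 is isolated; node 15 has the right label but low degree.
g_adj = {10: {11, 12, 13}, 11: {10, 12}, 12: {10, 11}, 13: {10},
         14: set(), 15: {16}, 16: {15}}
g_label = {10: "A", 11: "B", 12: "B", 13: "C", 14: "A", 15: "A", 16: "B"}

center = central_node(q_adj)
for c in central_candidates(q_adj, q_label, g_adj, g_label, center):
    print(c, region_candidates(q_adj, q_label, g_adj, g_label, center, c))
```

In this toy input, the isolated node 14 and the low-degree node 15 share the central node's label but are rejected as central candidates, so no region is explored around them. This is the pruning effect the abstract attributes to choosing a tightly connected central node.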
