Learning to Identify High Betweenness Centrality Nodes from Scratch: A Novel Graph Neural Network Approach

Betweenness centrality (BC) is one of the most widely used centrality measures in network analysis. It describes the importance of a node in terms of the fraction of shortest paths that pass through it, and it is key to many valuable applications, including community detection and network dismantling. Computing BC scores on large networks is challenging because of the high time complexity of exact algorithms. Many sampling-based approximation algorithms have been proposed to speed up the estimation of BC. However, these methods still require considerably long running times on large-scale networks, and their results are sensitive to even small perturbations of the network. In this paper, we focus on the efficient identification of the top-k nodes with the highest BC in a graph, which is an essential task for many network applications. Unlike previous heuristic methods, we turn this task into a learning problem and design an encoder-decoder framework as a solution. Specifically, the encoder leverages the network structure to represent each node as an embedding vector, which captures the important structural information of the node. The decoder transforms each embedding vector into a scalar, which indicates the relative rank of the node in terms of its BC. We train the model with a pairwise ranking loss so that it learns the ordering of nodes by BC. After training on small-scale networks, the model is capable of assigning relative BC scores to nodes in much larger networks, and thus of identifying the highly ranked nodes. Experiments on both synthetic and real-world networks demonstrate that, compared to existing baselines, our model drastically speeds up prediction without a noticeable sacrifice in accuracy, and even outperforms state-of-the-art methods in accuracy on several large real-world networks.
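To make the quantities in the abstract concrete, the following is a minimal sketch and not the model described in the paper. It computes exact BC on a toy graph with a Brandes-style accumulation, then evaluates a pairwise ranking loss on candidate scores. The function names, the toy path graph, and the specific RankNet-style logistic form of the loss are all assumptions made for illustration; the abstract only states that a pairwise ranking loss is used.

```python
import math
from collections import deque
from itertools import combinations


def betweenness(adj):
    """Exact BC for an unweighted, undirected graph (Brandes-style accumulation)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair (s, t) was visited from both endpoints.
    return {v: b / 2.0 for v, b in bc.items()}


def pairwise_ranking_loss(scores, bc):
    """Assumed RankNet-style logistic loss over all node pairs with distinct BC."""
    total, count = 0.0, 0
    for i, j in combinations(list(bc), 2):
        if bc[i] == bc[j]:
            continue  # ties carry no ordering information in this sketch
        target = 1.0 if bc[i] > bc[j] else 0.0
        d = scores[i] - scores[j]
        # Numerically stable binary cross-entropy on the logit d.
        softplus = max(d, 0.0) + math.log1p(math.exp(-abs(d)))
        total += softplus - target * d
        count += 1
    return total / count if count else 0.0


def top_k(scores, k):
    """Return the k nodes with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy example (hypothetical): a path graph 0-1-2-3-4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
bc = betweenness(adj)

# Hand-picked scores standing in for a decoder's output.
good_scores = {0: -1.0, 1: 1.0, 2: 2.0, 3: 1.0, 4: -1.0}
bad_scores = {v: -s for v, s in good_scores.items()}
```

Only the ordering of the scores matters here: scores that rank nodes in the same order as their exact BC yield a lower loss than scores that invert that order, which is why a scalar output per node is enough to recover the top-k set.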
