Efficient Maximum Closeness Centrality Group Identification

As a key concept in social network analysis, closeness centrality is widely adopted to measure the importance of a node. Many efficient algorithms have been developed in the literature to find the top-k closeness centrality nodes. Most previous work treats nodes as independent individuals when computing a top-k ranking. However, many applications require finding a set of nodes that is the most important as a group. In this paper, we extend the concept of closeness centrality to a set of nodes, and aim to find a set of k nodes that has the largest closeness centrality as a whole. We show that the problem is NP-hard, and prove that the objective function is monotonic and submodular. Therefore, the greedy algorithm returns a result with a \(1-1/e\) approximation ratio. To handle large graphs, we propose a baseline sampling algorithm (BSA). We further improve the sampling approach by considering the order of samples and reducing the cost of updating marginal gains, which leads to our order-based sampling algorithm (OSA). Finally, extensive experiments on four real-world social networks demonstrate the efficiency and effectiveness of the proposed methods.
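The greedy strategy mentioned above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not state the exact objective, so the sketch assumes an unweighted, undirected graph, takes the distance from a node to a group to be its distance to the nearest group member, and scores a group by how much it reduces the total distance (with the distance to the empty group capped at the number of nodes, n). The names `bfs_distances` and `greedy_group` are hypothetical, and the exact all-pairs BFS used here is precisely the cost that sampling approaches such as BSA and OSA aim to avoid.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def greedy_group(adj, k):
    """Greedily pick k nodes that reduce the total node-to-group distance.

    Assumption (not from the paper): the distance to the empty group is
    capped at n, so the total distance reduction is monotone and submodular.
    """
    n = len(adj)
    # Exact all-pairs distances; only feasible for small graphs.
    all_dist = {u: bfs_distances(adj, u) for u in adj}
    best = {v: n for v in adj}  # current distance from v to the group
    group = []
    for _ in range(k):
        best_gain, best_node = -1, None
        for u in adj:
            if u in group:
                continue
            # Marginal gain: total distance saved if u joins the group.
            gain = sum(max(0, best[v] - all_dist[u].get(v, n)) for v in adj)
            if gain > best_gain:
                best_gain, best_node = gain, u
        group.append(best_node)
        for v in adj:
            best[v] = min(best[v], all_dist[best_node].get(v, n))
    return group, best


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 - 3 - 4; the middle node is selected first.
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    group, best = greedy_group(path, 2)
    print(group, sum(best.values()))
```

Because the assumed objective is monotone and submodular, the standard greedy guarantee of \(1-1/e\) applies to this sketch; whether it matches the paper's exact definition of group closeness is an assumption.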
