Discriminative Distance-Based Network Indices and the Tiny-World Property

Distance-based indices, including closeness centrality, average path length, eccentricity and average eccentricity, are important tools for network analysis. In these indices, the distance between two vertices is measured by the length of the shortest paths between them. However, this measure has three shortcomings. The first, which is well studied, is that extending it to disconnected graphs (and also to directed graphs) is controversial. The second is that when this measure is used in real-world networks, a huge number of vertices may have exactly the same closeness or eccentricity score. The third is that in many applications, the distance between two vertices depends not only on the length of the shortest paths, but also on the number of shortest paths between them. In this paper, we develop a new distance measure between vertices of a graph that yields discriminative distance-based centrality indices. This measure is proportional to the length of the shortest paths and inversely proportional to the number of shortest paths. We present algorithms for exact computation of the proposed discriminative indices. We then develop randomized algorithms that accurately estimate average discriminative path length and average discriminative eccentricity, and show that they give $(\epsilon,\delta)$-approximations of these indices. Finally, we perform extensive experiments over several real-world networks from different domains and show that, compared to the traditional indices, the discriminative indices are usually far more discriminative. Our experiments reveal that real-world networks usually have a tiny average discriminative path length, bounded by a constant (e.g., 2). We refer to this property as the tiny-world property.
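
As an illustration of the measure described above, the following is a minimal Python sketch, not the paper's algorithm. It assumes the simplest form consistent with the abstract: the discriminative distance between u and v is d(u, v) / sigma(u, v), where d is the shortest-path length and sigma is the number of shortest paths. It also assumes a connected, unweighted, undirected graph stored as an adjacency dictionary, and it averages over ordered vertex pairs. The function names, the normalization, and the Hoeffding-based sample size are assumptions made for this example only.

```python
from collections import deque
import math
import random


def bfs_counts(adj, source):
    """BFS from source; return (dist, sigma), where sigma[v] is the
    number of shortest paths from source to v."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:            # first time v is reached
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:   # u lies on a shortest path to v
                sigma[v] += sigma[u]
    return dist, sigma


def discriminative_distances(adj, source):
    """Assumed form of the measure: path length divided by path count."""
    dist, sigma = bfs_counts(adj, source)
    return {v: dist[v] / sigma[v] for v in dist if v != source}


def hoeffding_sample_size(eps, delta, value_range):
    """Samples needed so that the mean of values lying in an interval of
    width value_range is within eps of its expectation with probability
    at least 1 - delta (Hoeffding's inequality)."""
    return math.ceil(value_range ** 2 * math.log(2 / delta) / (2 * eps ** 2))


def estimate_avg_discriminative_path_length(adj, num_samples, rng):
    """Average the per-source mean discriminative distance over
    uniformly sampled sources (assumes a connected graph)."""
    nodes = list(adj)
    total = 0.0
    for _ in range(num_samples):
        source = rng.choice(nodes)
        dd = discriminative_distances(adj, source)
        total += sum(dd.values()) / (len(nodes) - 1)
    return total / num_samples


# A 4-cycle: opposite vertices are at distance 2 but joined by 2 shortest paths.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
dd_from_0 = discriminative_distances(cycle, 0)
estimate = estimate_avg_discriminative_path_length(cycle, 10, random.Random(0))
```

In the 4-cycle, the vertex opposite the source is at distance 2 but is reached by two shortest paths, so under the assumed formula its discriminative distance is 1.0, the same as that of a direct neighbour.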
