Discriminative Distance-Based Network Indices with Application to Link Prediction

In large networks, using the length of shortest paths as the distance measure has three shortcomings. The first, which is well studied, is that extending it to disconnected graphs and directed graphs is controversial. The second is that a huge number of vertices may have exactly the same score. The third is that in many applications, the distance between two vertices depends not only on the length of shortest paths but also on the number of shortest paths. In this paper, we first develop a new distance measure between vertices of a graph that yields discriminative distance-based centrality indices. This measure is proportional to the length of shortest paths and inversely proportional to the number of shortest paths. We present algorithms for exact computation of the proposed discriminative indices. Second, we develop randomized algorithms that precisely estimate average discriminative path length and average discriminative eccentricity, and we show that they give $(\epsilon,\delta)$-approximations of these indices. Third, we perform extensive experiments over several real-world networks from different domains. In these experiments, we first show that, compared to the traditional indices, discriminative indices usually have much greater discriminability. We then show that our randomized algorithms can very precisely estimate average discriminative path length and average discriminative eccentricity using only a few samples. Finally, we show that real-world networks usually have a tiny average discriminative path length, bounded by a constant (e.g., 2). Fourth, to better motivate the usefulness of our proposed distance measure, we present a novel link prediction method that uses discriminative distance to decide which vertices are more likely to form a link in the future, and we show its superior performance compared to well-known existing measures.
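To make the idea of a measure that grows with shortest-path length and shrinks with the number of shortest paths concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the simplest form consistent with the description above, namely $d(u,v)/\sigma(u,v)$, where $d$ is the shortest-path length and $\sigma$ is the number of shortest paths; the abstract itself does not state the exact formula. The function names, the adjacency-dictionary representation, and the uniform source-sampling estimator are all illustrative assumptions, and the graph is assumed to be unweighted, undirected, and connected.

```python
from collections import deque
import random


def discriminative_distances(adj, source):
    """Return {v: d(source, v) / sigma(source, v)} for every reachable v != source.

    ASSUMPTION: the ratio d / sigma is one simple measure that is proportional
    to shortest-path length and inversely proportional to the number of
    shortest paths; the abstract does not give the exact definition.
    """
    dist = {source: 0}    # shortest-path length from source
    sigma = {source: 1}   # number of shortest paths from source
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:              # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:     # u lies on a shortest path to w
                sigma[w] += sigma[u]
    return {v: dist[v] / sigma[v] for v in dist if v != source}


def estimate_avg_discriminative_path_length(adj, num_samples, seed=0):
    """Estimate the average discriminative distance by sampling source vertices.

    Hypothetical estimator: sample sources uniformly at random with
    replacement, average the discriminative distance from each sampled source
    to all other vertices, then average over the samples.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    n = len(nodes)
    total = 0.0
    for _ in range(num_samples):
        s = rng.choice(nodes)
        dd = discriminative_distances(adj, s)
        total += sum(dd.values()) / (n - 1)
    return total / num_samples


# A 4-cycle: opposite vertices are at distance 2 but joined by 2 shortest paths.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
from_zero = discriminative_distances(cycle, 0)
estimate = estimate_avg_discriminative_path_length(cycle, num_samples=5)
```

On the 4-cycle, the opposite vertex is two hops away but reachable by two shortest paths, so under the assumed ratio it gets the same score as a direct neighbor. This is the sense in which the number of shortest paths changes the ranking compared with plain shortest-path length. How many samples are needed for a given $(\epsilon,\delta)$ guarantee depends on bounds that the abstract does not state, so `num_samples` is left as a free parameter here.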
