Top influencers can be identified universally by combining classical centralities

Information, opinions, and epidemics spread over structured networks. When node centrality indicators are used to predict which nodes will be among the top influencers, or superspreaders, no single centrality ranks nodes consistently well across networks. We show that statistical classifiers using two or more centralities are, by contrast, consistently predictive over many diverse, static real-world topologies. Certain pairs of centralities cooperate particularly well in drawing the statistical boundary between the superspreaders and the rest: a local centrality measuring the size of a node’s neighbourhood gains from the addition of a global centrality such as eigenvector centrality, closeness, or the core number. Intuitively, this is because a local centrality may assign a high rank to nodes located in locally dense but globally peripheral regions of the network; the additional global centrality indicator guides the prediction towards more central regions. The superspreaders usually jointly maximise the values of both centralities. As a result of this interplay between centrality indicators, training classifiers with seven classical indicators leads to a nearly maximal average precision (0.995) across the networks in this study.
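The intuition about pairing a local with a global centrality can be illustrated with a small sketch. It is not the classifiers used in the study: it assumes a hand-built toy graph (a 5-clique plus a peripheral hub with six leaves) and replaces the trained statistical boundary with a hypothetical hand-written rule, the minimum of the normalised degree and the normalised core number. All function names are illustrative.

```python
def build_toy_graph():
    """Assumed toy topology: a 5-clique (nodes 0-4) and a peripheral hub (node 5)
    attached to node 0 and to six leaves (nodes 6-11)."""
    edges = [(a, b) for a in range(5) for b in range(a + 1, 5)]
    edges.append((0, 5))
    edges += [(5, leaf) for leaf in range(6, 12)]
    adj = {v: set() for v in range(12)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def degree(adj):
    """Local centrality: size of each node's neighbourhood."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def core_number(adj):
    """Global centrality: k-core index, found by repeatedly peeling
    the remaining node of minimum degree."""
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=lambda u: (deg[u], u))  # deterministic tie-break
        k = max(k, deg[v])
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                deg[w] -= 1
    return core


def joint_score(deg, core):
    """Hypothetical stand-in for a trained classifier: a node scores highly
    only if BOTH normalised centralities are high."""
    max_deg = max(deg.values())
    max_core = max(core.values())
    return {v: min(deg[v] / max_deg, core[v] / max_core) for v in deg}


adj = build_toy_graph()
deg = degree(adj)
core = core_number(adj)
score = joint_score(deg, core)

top_by_degree = max(adj, key=deg.get)
top_joint = max(adj, key=score.get)
print("top by degree alone:", top_by_degree)
print("top by joint score: ", top_joint)
```

On this toy graph, degree alone ranks the peripheral hub (node 5) first, whereas the joint score prefers node 0, which sits in the clique and also has a large neighbourhood. A trained classifier would learn such a boundary from spreading outcomes rather than use a fixed rule.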
