The Role of Graphlets in Viral Processes on Networks

Predicting the evolution of viral processes on networks is an important problem with applications in biology, the social sciences, and the study of the Internet. Existing work uses mean-field analysis based on the degree distribution to predict viral spreading across networks of different types. However, degree distribution alone has been shown to fail at predicting the behavior of viruses on some real-world networks, and recent work has attempted to address this shortcoming using assortativity. In this paper, we show that adding assortativity does not fully explain the variance in the spread of viruses for a number of real-world networks. We propose using the graphlet frequency distribution in combination with assortativity to explain variations in the evolution of viral processes across networks with identical degree distribution. Using a data-driven approach that couples predictive modeling with viral process simulation on real-world networks, we show that simple regression models based on the graphlet frequency distribution can explain over 95% of the variance in virality on networks with the same degree distribution but different topologies. Our results not only highlight the importance of graphlets but also identify a small collection of graphlets that may have the highest influence on viral processes on a network.
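To make the phrase "same degree distribution but different topologies" concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a toy ring-lattice graph and uses hypothetical helper names (`degree_preserving_rewire`, `three_node_graphlets`, `degree_assortativity`). It rewires edges by double-edge swaps, which keeps every node's degree fixed, and then compares the counts of the two connected 3-node graphlets along with degree assortativity. A full graphlet frequency distribution would also cover larger graphlets, which this sketch does not count.

```python
import random
from collections import defaultdict


def adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def degree_sequence(edges):
    """Sorted list of node degrees."""
    return sorted(len(nbrs) for nbrs in adjacency(edges).values())


def degree_preserving_rewire(edges, n_swaps, seed=0):
    """Double-edge swaps: (a,b),(c,d) -> (a,d),(c,b). Degrees never change."""
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:  # pick one of the two possible pairings
            c, d = d, c
        if len({a, b, c, d}) < 4:
            continue  # would create a self-loop
        e1, e2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if e1 in edge_set or e2 in edge_set:
            continue  # would create a duplicate edge
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {e1, e2}
        edge_list[i], edge_list[j] = e1, e2
        done += 1
    return edge_list


def three_node_graphlets(edges):
    """Counts of the two connected 3-node induced subgraphs."""
    adj = adjacency(edges)
    # Each triangle is seen once from each of its three edges.
    triangles = sum(len(adj[u] & adj[v]) for u, v in edges) // 3
    wedges = sum(len(n) * (len(n) - 1) // 2 for n in adj.values())
    return {"triangle": triangles, "open_path": wedges - 3 * triangles}


def degree_assortativity(edges):
    """Pearson correlation of the degrees at the two ends of each edge."""
    adj = adjacency(edges)
    xs, ys = [], []
    for u, v in edges:  # include both orientations of every edge
        xs += [len(adj[u]), len(adj[v])]
        ys += [len(adj[v]), len(adj[u])]
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs)
    if var == 0:
        return float("nan")  # undefined for regular graphs
    return sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / var


# Toy graph (an assumption): 12-node ring lattice plus two chords.
n = 12
original = [(i, (i + k) % n) for i in range(n) for k in (1, 2)] + [(0, 3), (0, 6)]
rewired = degree_preserving_rewire(original, n_swaps=50, seed=1)

for name, g in (("original", original), ("rewired", rewired)):
    print(name, three_node_graphlets(g), round(degree_assortativity(g), 3))
```

Because the number of length-2 paths is fixed by the degree sequence, any change in the triangle count after rewiring is offset exactly in the open-path count. Features like these, computed over many rewired copies of a network, are the kind of input a regression model of virality could use.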
