Community detection, link prediction, and layer interdependence in multilayer networks

Complex systems are often characterized by distinct types of interactions between the same entities. Such systems can be described as multilayer networks in which each layer represents one type of interaction. The layers may be interdependent in complicated ways, each revealing different kinds of structure in the network. In this work we present a generative model and an efficient expectation-maximization algorithm that together allow us to perform inference tasks such as community detection and link prediction in this setting. Our model assumes overlapping communities that are shared across layers, while allowing these communities to affect each layer differently, including arbitrary mixtures of assortative, disassortative, or directed structure. It also gives us a mathematically principled way to define the interdependence between layers: we measure how much information about one layer helps us predict links in another layer. In particular, this allows us to bundle layers together to compress redundant information and to identify small groups of layers that suffice to predict the remaining layers accurately. We illustrate these findings by analyzing synthetic data and two real multilayer networks, one representing social support relationships among villagers in South India and the other representing shared genetic substrings between genes of the malaria parasite.
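To make the setting concrete, the following is a minimal sketch of one way such a generative model could look. It assumes, beyond what the abstract states, that edge counts are Poisson distributed with a rate built from outgoing memberships `u`, incoming memberships `v`, and a per-layer affinity matrix `w`. All names (`sample_multilayer`, `auc`, `u`, `v`, `w`) are hypothetical, the memberships are hard rather than overlapping to keep the example short, and no expectation-maximization step is shown. The sketch samples three layers sharing the same two communities (assortative, disassortative, and directed) and scores link prediction by AUC.

```python
import numpy as np

def sample_multilayer(u, v, w, rng):
    """Assumed model: A[a, i, j] ~ Poisson(sum_{k,q} u[i,k] * w[a,k,q] * v[j,q])."""
    rate = np.einsum("ik,akq,jq->aij", u, w, v)
    return rng.poisson(rate), rate

def auc(scores, labels):
    """Probability that a random edge outscores a random non-edge (ties count 1/2)."""
    pos = scores[labels]
    neg = scores[~labels]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

rng = np.random.default_rng(0)
n, K = 60, 2

# Hard memberships for brevity; in general each row may mix several groups.
groups = np.repeat(np.arange(K), n // K)
u = np.eye(K)[groups]   # outgoing memberships, shared by all layers
v = u.copy()            # incoming memberships, shared by all layers

# One affinity matrix per layer: same communities, different structure.
w = np.array([
    [[0.30, 0.02], [0.02, 0.30]],   # layer 0: assortative
    [[0.02, 0.30], [0.30, 0.02]],   # layer 1: disassortative
    [[0.02, 0.30], [0.02, 0.02]],   # layer 2: directed, group 0 -> group 1
])

A, rate = sample_multilayer(u, v, w, rng)   # self-loops are kept for simplicity

# Score the edges of layer 0 with its own rates and with layer 1's rates.
labels0 = A[0].ravel() > 0
auc_same = auc(rate[0].ravel(), labels0)
auc_cross = auc(rate[1].ravel(), labels0)
print(f"AUC, layer 0 rates on layer 0 edges: {auc_same:.2f}")
print(f"AUC, layer 1 rates on layer 0 edges: {auc_cross:.2f}")
```

In this toy setup the two AUC values fall on opposite sides of 0.5, because the assortative and disassortative layers use the same communities in opposite ways. Comparing predictive performance across layers in this spirit is what the abstract describes as measuring interdependence, although the paper's own procedure is based on inferred parameters and held-out links rather than the true rates used here.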
