A Generative Model for Exploring Structure Regularities in Attributed Networks

Many real-world networks, known as attributed networks, contain two types of information: topology and node attributes. Using both types of information together to explore structural regularities is a challenging task. In this paper, by characterizing the potential relationship between link communities and node attributes, we propose a principled statistical model, named PSB_PG, that generates both link topology and node attributes. The link-generating component is based on a stochastic blockmodel in which links follow a Poisson distribution. It can therefore detect a wide range of network structures, including community structures, bipartite structures, and mixtures of the two. The attribute-generating component assumes that node attributes are high-dimensional and sparse and that they also follow a Poisson distribution. Because both components share the same distributional form, the model is uniform and its parameters can be estimated directly with the expectation-maximization (EM) algorithm. Experimental results on artificial and real networks containing various structures show that PSB_PG is competitive with state-of-the-art models and also provides a good semantic interpretation of each community through the learned relationship between the community and its related attributes.
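To make the generative idea concrete, the following is a minimal sketch of sampling an attributed network in which both link counts and attribute counts are Poisson-distributed. It is not the PSB_PG parameterization: it assumes a simplified hard assignment of each node to a single group, and the names `groups`, `block_rates`, `attr_rates`, `sample_poisson`, and `generate_attributed_network` are hypothetical, introduced only for illustration.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) sample with Knuth's method (suitable for small rates)."""
    threshold = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def generate_attributed_network(groups, block_rates, attr_rates, seed=0):
    """Sample an undirected weighted graph and a node-attribute count matrix.

    Assumptions (illustrative, not taken from the paper):
      groups[i]          -- single group label of node i (hard assignment)
      block_rates[r][s]  -- expected number of links between a node in r and a node in s
      attr_rates[r][k]   -- expected count of attribute k for a node in group r
    """
    rng = random.Random(seed)
    n = len(groups)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            weight = sample_poisson(block_rates[groups[i]][groups[j]], rng)
            adjacency[i][j] = adjacency[j][i] = weight  # keep the matrix symmetric
    attributes = [
        [sample_poisson(rate, rng) for rate in attr_rates[g]] for g in groups
    ]
    return adjacency, attributes


# Two groups with mostly between-group links (a bipartite-like structure),
# and sparse attributes that each group prefers differently.
groups = [0] * 5 + [1] * 5
block_rates = [[0.05, 0.9], [0.9, 0.05]]
attr_rates = [[1.5, 0.1, 0.0], [0.0, 0.1, 1.5]]
adjacency, attributes = generate_attributed_network(groups, block_rates, attr_rates, seed=42)
```

Changing `block_rates` so that the diagonal entries dominate would instead produce assortative community structure, which illustrates why a blockmodel of this kind is not limited to one type of structure.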
