On Triangular versus Edge Representations --- Towards Scalable Modeling of Networks

In this paper, we argue for representing networks as a bag of triangular motifs, particularly for important network problems that current model-based approaches handle poorly due to computational bottlenecks incurred by using edge representations. Such approaches require both 1-edges and 0-edges (missing edges) to be provided as input, and as a consequence, approximate inference algorithms for these models usually require Ω(N²) time per iteration, precluding their application to larger real-world networks. In contrast, triangular modeling requires less computation, while providing equivalent or better inference quality. A triangular motif is a vertex triple containing 2 or 3 edges, and the number of such motifs is Θ(Σ_i D_i²) (where D_i is the degree of vertex i), which is much smaller than N² for low-maximum-degree networks. Using this representation, we develop a novel mixed-membership network model and approximate inference algorithm suitable for large networks with low max-degree. For networks with high maximum degree, the triangular motifs can be naturally subsampled in a node-centric fashion, allowing for much faster inference at a small cost in accuracy. Empirically, we demonstrate that our approach, when compared to that of an edge-based model, has faster runtime and improved accuracy for mixed-membership community detection. We conclude with a large-scale demonstration on an N ≈ 280,000-node network, which is infeasible for network models with Ω(N²) inference cost.
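To make the motif count concrete, here is a minimal Python sketch that counts the triangular motifs of a small undirected graph. It is an illustration of the definition above, not the paper's implementation; the names `build_adjacency` and `count_triangular_motifs` and the toy edge list are assumptions introduced for this example. Vertex i is the centre of D_i(D_i - 1)/2 neighbour pairs, so the total work and the motif count are both bounded by a sum of squared degrees.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set of neighbours}."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops; they cannot form a vertex triple
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_triangular_motifs(adj):
    """Return (open_triples, triangles).

    open_triples: vertex triples containing exactly 2 edges.
    triangles:    vertex triples containing all 3 edges.
    """
    # Every pair of neighbours of a vertex forms a 2-edge path centred there.
    centred_pairs = sum(len(n) * (len(n) - 1) // 2 for n in adj.values())

    # Count neighbour pairs that are themselves connected (closed paths).
    closed = 0
    for nbrs in adj.values():
        for a, b in combinations(sorted(nbrs), 2):
            if b in adj[a]:
                closed += 1

    triangles = closed // 3  # each triangle is seen once from each corner
    open_triples = centred_pairs - 3 * triangles
    return open_triples, triangles


# Hypothetical toy graph: one triangle (0, 1, 2) plus a pendant edge 2-3.
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
toy_adj = build_adjacency(toy_edges)
open_triples, triangles = count_triangular_motifs(toy_adj)
n = len(toy_adj)
print("open 2-edge triples:", open_triples)
print("closed triangles:", triangles)
print("total motifs:", open_triples + triangles, "vs N^2 =", n * n)
```

On a graph this small the gap to N² is modest; the saving the abstract describes appears when N is large and the maximum degree stays low.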
