Efficient personalized community detection via genetic evolution

Personalized community detection aims to generate communities on a graph that reflect a user's need, which benefits downstream tasks such as node recommendation and link prediction for that user. Despite its importance, the task has received little attention in previous studies, which focus on user-independent, semi-supervised, or top-K user-centric community detection. Moreover, most existing models are time-consuming because of the complex graph structure. Unlike these settings, personalized community detection requires a higher-resolution partition of the nodes that are more relevant to the user need and a coarser partition of the remaining, less relevant nodes. To solve this task efficiently, we propose a genetic model with an offline step and an online step. In the offline step, the user-independent community structure is encoded as a binary tree. In the online step, genetic pruning is applied to partition the tree into communities. To further reduce running time, we also deploy a distributed version of the model that runs in a parallel environment. Extensive experiments on multiple datasets show that our model outperforms state-of-the-art methods with significantly reduced running time.
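The two-step idea can be illustrated with a minimal sketch. It is not the paper's implementation: the nested-tuple tree encoding, the `fitness` objective, the `relevance` scores, and all parameter values are assumptions chosen for illustration. A pruning is represented as the set of internal tree nodes that are split; every unsplit subtree becomes one community, and a simple genetic loop searches over such sets.

```python
import random

def internal_paths(tree, path=""):
    """List the path ('' = root, then 'L'/'R' steps) of every internal node."""
    if isinstance(tree, str):
        return []
    return ([path]
            + internal_paths(tree[0], path + "L")
            + internal_paths(tree[1], path + "R"))

def leaves(tree):
    """Collect the graph nodes stored at the leaves of a subtree."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])

def decode(tree, split, path=""):
    """Turn a set of split internal nodes into communities (lists of leaves)."""
    if isinstance(tree, str) or path not in split:
        return [leaves(tree)]
    return (decode(tree[0], split, path + "L")
            + decode(tree[1], split, path + "R"))

def fitness(communities, relevance, penalty=1.0):
    """Hypothetical objective (an assumption, not the paper's): relevant nodes
    prefer small communities, and every community costs `penalty`."""
    cost = sum(relevance.get(n, 0.0) * (len(c) - 1)
               for c in communities for n in c)
    return -cost - penalty * len(communities)

def evolve(tree, relevance, pop_size=30, generations=60,
           mutation_rate=0.1, seed=0):
    """Search over prunings with truncation selection, uniform crossover,
    and bit-flip mutation; returns the best partition found."""
    rng = random.Random(seed)
    paths = internal_paths(tree)

    def score(split):
        return fitness(decode(tree, split), relevance)

    population = [frozenset(p for p in paths if rng.random() < 0.5)
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # the better half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: inherit each split flag from either parent.
            child = {p for p in paths if p in (a if rng.random() < 0.5 else b)}
            # Bit-flip mutation: toggle a few split flags at random.
            child ^= {p for p in paths if rng.random() < mutation_rate}
            children.append(frozenset(child))
        population = parents + children
    return decode(tree, max(population, key=score))

# Offline step stand-in: a hand-written binary tree over eight graph nodes.
tree = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))
# Online step: the (hypothetical) user need marks nodes a and b as relevant.
communities = evolve(tree, relevance={"a": 1.0, "b": 1.0})
```

Under this assumed objective, the search is pushed toward isolating the relevant nodes in small communities while leaving the rest of the tree coarsely partitioned. Because the fitness of each individual is computed independently, the scoring step is also the natural place to parallelize.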
