Learning Network Embedding with Community Structural Information

Network embedding learns low-dimensional representations of the vertices in a network, aiming to capture and preserve the network's structure and inherent properties. The vast majority of existing network embedding methods focus exclusively on vertex proximity and ignore the internal community structure of the network. However, the homophily principle indicates that vertices within the same community are more similar to each other than to vertices from different communities, so vertices within the same community should have similar representations. Motivated by this, we propose NECS, a novel framework that learns Network Embedding with Community Structural information; it preserves high-order proximity and incorporates community structure into vertex representation learning. We formulate the problem as a principled optimization framework and provide an effective alternating algorithm to solve it. Extensive experimental results on several benchmark network datasets demonstrate the effectiveness of the proposed framework in various network analysis tasks, including network reconstruction, link prediction, and vertex classification.
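To make the two ingredients named above concrete, the following is a minimal illustrative sketch, not the NECS formulation itself. It assumes that high-order proximity can be approximated by a weighted sum of powers of the row-normalized adjacency matrix, and that community structure can be scored with Newman's modularity. The function names, the weights, and the toy graph are hypothetical choices made for illustration only.

```python
import numpy as np

def high_order_proximity(A, weights):
    """Weighted sum of powers of the row-normalized adjacency matrix.

    Assumption for illustration: weights[k-1] scales the k-step
    transition matrix P^k. This is one common definition, not
    necessarily the one used by NECS.
    """
    P = A / A.sum(axis=1, keepdims=True)  # one-step transition matrix
    S = np.zeros_like(P)
    P_k = np.eye(A.shape[0])
    for w in weights:
        P_k = P_k @ P                     # advance to the next power of P
        S += w * P_k
    return S

def modularity(A, labels):
    """Newman modularity Q of a hard partition given as integer labels."""
    degrees = A.sum(axis=1)
    two_m = degrees.sum()                 # twice the number of edges
    B = A - np.outer(degrees, degrees) / two_m  # modularity matrix
    H = np.eye(labels.max() + 1)[labels]  # one-hot community indicators
    return np.trace(H.T @ B @ H) / two_m

# Toy graph (hypothetical): two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

S = high_order_proximity(A, weights=[0.5, 0.3, 0.2])
labels = np.array([0, 0, 0, 1, 1, 1])
Q = modularity(A, labels)
```

In this sketch, `S` gives non-zero proximity to vertex pairs that are not directly linked but lie within three steps of each other, while `Q` rewards partitions whose communities are densely connected internally. A community-aware embedding method would typically try to respect both signals at once.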
