Learning Deep Representations in Large Integrated Network for Graph Clustering

Social communities, groups of users closely related through shared characteristics, are common structures hidden in social networks. Users tend to cluster into a group because of similar interests or frequent interactions. Identifying such communities in social networks is of both methodological and practical value. Existing methods are limited: the user network needs to be learned from heterogeneous networks, yet the complete feature expression of each user is ignored. To address this challenge, we propose a novel model called DeepInNet. The method obtains a comprehensive deep representation by learning from information in different modes, each presented as a corresponding network structure and together representing the heterogeneous information in the social network. DeepInNet first computes the diffusion states of each node in the heterogeneous network and uses them as features. Given the resulting integrated network representation, we introduce a stacked auto-encoder model to form a deep neural network and learn the deep representation. These low-dimensional representations can then be used to cluster interest communities in the social network quickly. DeepInNet has been tested on four real-world datasets, including two large-scale datasets, and compared with several common approaches to social network clustering. The experimental results show that the integrated deep representation found by DeepInNet may match well with the known social communities and that it can outperform state-of-the-art approaches to analyzing large-scale social networks.
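The first stage described above, computing a diffusion state for every node in each network and combining them into one feature matrix, can be illustrated with a minimal sketch. The abstract does not specify how the diffusion states are computed, so this sketch assumes a random walk with restart; the function names `diffusion_states` and `integrate_networks`, the restart probability of 0.5, and the two toy adjacency matrices are all hypothetical choices made for illustration, not details taken from DeepInNet.

```python
import numpy as np


def diffusion_states(adj, restart=0.5, tol=1e-8, max_iter=1000):
    """Random walk with restart from every node (an assumed diffusion model).

    Returns an n x n matrix whose row i is the stationary visiting
    distribution of a walk that restarts at node i.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # avoid division by zero for isolated nodes
    transition = adj / col_sums    # column-stochastic transition matrix
    identity = np.eye(n)
    states = np.eye(n)             # column i starts as the indicator of node i
    for _ in range(max_iter):
        updated = (1 - restart) * transition @ states + restart * identity
        converged = np.abs(updated - states).max() < tol
        states = updated
        if converged:
            break
    return states.T                # row i = diffusion state of node i


def integrate_networks(adjacency_list, restart=0.5):
    """Concatenate per-network diffusion states into one feature matrix."""
    return np.hstack([diffusion_states(a, restart) for a in adjacency_list])


# Two toy networks over the same four users (hypothetical data).
friendship = np.array([[0, 1, 1, 0],
                       [1, 0, 1, 0],
                       [1, 1, 0, 1],
                       [0, 0, 1, 0]])
interaction = np.array([[0, 1, 0, 0],
                        [1, 0, 0, 1],
                        [0, 0, 0, 1],
                        [0, 1, 1, 0]])

features = integrate_networks([friendship, interaction])
print(features.shape)
```

In the pipeline the abstract describes, a feature matrix of this kind would then be passed to a stacked auto-encoder, and the resulting low-dimensional codes would be clustered; both steps are omitted here.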