hier2vec: interpretable multi-granular representation learning for hierarchy in social networks

Network representation learning (NRL) maps vertices into a latent vector space for further network inference. Existing algorithms focus mainly on whether the vectors of two similar nodes are close in the latent vector space, while hierarchical proximity has been largely neglected. The distribution of the representation vectors should also reflect the hierarchical structural properties that widely exist in networks. In this paper, we propose a novel network representation learning framework that encodes interpretable hierarchical structural semantics into the representation vectors. Specifically, we measure the distance and importance degree of nodes in the original network and map the nodes to a tree space. The resulting tree clearly reveals the hierarchical structural relations of the original network and is itself readily interpretable. The local structural proximities and the interpretable hierarchy knowledge are then encoded into the vector space by optimizing an objective function. Extensive experiments on real-world data sets demonstrate that the proposed approach outperforms existing state-of-the-art approaches on node classification, link prediction, and visualization. Finally, a case study further analyzes how the proposed model works.
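As a rough illustration of mapping nodes to a tree space from distance and importance, the sketch below attaches each node to its nearest node of higher importance, in the style of density-peak clustering. It is not the paper's algorithm: using degree as the importance measure, hop count as the distance, the tie-breaking rules, and the names `bfs_distances` and `build_tree` are all assumptions made for this example.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def build_tree(adj):
    """Map each node to a parent, giving a tree (or forest) over the nodes.

    Assumptions (not taken from the paper):
      * importance of a node = its degree;
      * distance between nodes = hop count;
      * parent = nearest reachable node with strictly higher rank, where
        rank orders by importance and breaks ties by smaller integer id.
    A node with no higher-ranked reachable node becomes a root (parent None).
    """
    importance = {u: len(nbrs) for u, nbrs in adj.items()}

    def rank(u):
        return (importance[u], -u)

    parent = {}
    for u in adj:
        dist = bfs_distances(adj, u)
        candidates = [v for v in dist if rank(v) > rank(u)]
        if candidates:
            # Prefer the closest candidate, then the most important one.
            parent[u] = min(candidates, key=lambda v: (dist[v], -importance[v], v))
        else:
            parent[u] = None
    return parent


# Toy undirected graph: a triangle 0-1-2 with a tail 0-3-4.
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0, 4], 4: [3]}
parent = build_tree(adj)
print(parent)
```

In this toy graph the highest-degree node becomes the root and the tail node 4 hangs beneath node 3, so depth in the tree gives a coarse-to-fine (multi-granular) view of the network. How such a tree would be folded into the embedding objective is left open here, since the abstract does not specify it.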
