Network Representation Learning: A Survey

With the widespread use of information technologies, information networks are becoming increasingly popular for capturing complex relationships across various disciplines, such as social networks, citation networks, telecommunication networks, and biological networks. Analyzing these networks sheds light on different aspects of social life, such as the structure of societies, information diffusion, and communication patterns. In practice, however, the large scale of information networks often makes network analytic tasks computationally expensive or intractable. Network representation learning has recently been proposed as a new learning paradigm that embeds network vertices into a low-dimensional vector space while preserving network topology structure, vertex content, and other side information. This allows the original network to be handled easily in the new vector space for further analysis. In this survey, we perform a comprehensive review of the current literature on network representation learning in the data mining and machine learning fields. We propose new taxonomies to categorize and summarize the state-of-the-art network representation learning techniques according to the underlying learning mechanisms, the network information they are intended to preserve, and the algorithmic designs and methodologies. We summarize the evaluation protocols used for validating network representation learning, including published benchmark datasets, evaluation methods, and open source algorithms. We also perform empirical studies to compare the performance of representative algorithms on common datasets and analyze their computational complexity. Finally, we suggest promising research directions to facilitate future study.
