Understanding graph embedding methods and their applications

Graph analytics can lead to a better quantitative understanding and control of complex networks, but traditional methods suffer from the high computational cost and excessive memory requirements associated with the high dimensionality and heterogeneous characteristics of industrial-size networks. Graph embedding techniques can effectively convert high-dimensional sparse graphs into low-dimensional, dense, and continuous vector spaces while preserving the graph structure properties as far as possible. An emerging class of methods, Gaussian distribution-based graph embedding, additionally provides uncertainty estimates. The main goal of graph embedding methods is to pack every node's properties into a lower-dimensional vector, so that node similarity in the original complex, irregular space can be easily quantified in the embedded vector space using standard metrics. The resulting nonlinear and highly informative graph embeddings in the latent space can be conveniently used to address different downstream graph analytics tasks (e.g., node classification, link prediction, community detection, and visualization). In this Review, we present some fundamental concepts in graph analytics and graph embedding methods, focusing in particular on random walk-based and neural network-based methods. We also discuss the emerging deep learning-based dynamic graph embedding methods. We highlight the distinct advantages of graph embedding methods in four diverse applications, and present implementation details and references to open-source software and available databases in the Appendix for interested readers who want to start exploring graph analytics.
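As a rough illustration of the two ideas above (random walk-based sampling, and comparing nodes with a standard metric once they are vectors), the following minimal sketch may help. It is not taken from the Review: the function names `random_walks` and `cosine_similarity`, the toy graph, and all parameter values are assumptions chosen for illustration. In DeepWalk-style methods the sampled walks would then be fed to a skip-gram model to learn the embeddings; that training step is not shown here.

```python
import math
import random


def random_walks(adj, num_walks, walk_length, seed=0):
    """Sample uniform random walks starting from every node.

    adj maps each node to a list of its neighbors (hypothetical toy format).
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        nodes = list(adj)
        rng.shuffle(nodes)  # visit start nodes in a random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy undirected graph (an assumption): two triangles joined by the edge 2-3.
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
walks = random_walks(adj, num_walks=2, walk_length=5)
```

Each walk plays the role of a "sentence" of node identifiers, which is why word-embedding machinery can be reused on graphs. Once embeddings exist, `cosine_similarity` is one example of the standard metrics that can compare two nodes directly in the latent space.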
