Deep Variational Network Embedding in Wasserstein Space

Network embedding, which aims to embed a network into a low-dimensional vector space while preserving its inherent structural properties, has attracted considerable attention recently. Most existing methods embed nodes as point vectors in a low-dimensional continuous space, so the formation of an edge is deterministic and determined only by the positions of the nodes. However, the formation and evolution of real-world networks are full of uncertainties, which makes these methods suboptimal. To address this problem, we propose a novel Deep Variational Network Embedding in Wasserstein Space (DVNE). The proposed method learns a Gaussian distribution in the Wasserstein space as the latent representation of each node, which simultaneously preserves the network structure and models the uncertainty of nodes. Specifically, we use the 2-Wasserstein distance as the similarity measure between distributions, which preserves transitivity in the network at linear computational cost. Moreover, our method relates the mean and the variance mathematically through the deep variational model, so that the mean vectors capture the positions of nodes and the variances capture their uncertainties. Additionally, our method captures both the local and the global network structure by preserving first-order and second-order proximity. Our experimental results demonstrate that our method effectively models the uncertainty of nodes in networks and shows a substantial gain over state-of-the-art methods on real-world applications such as link prediction and multi-label classification.
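To make the distance computation concrete, the following is a minimal sketch of the 2-Wasserstein distance between two Gaussians. It assumes diagonal covariance matrices, which the abstract does not state; under that assumption the distance has the closed form W2^2 = ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2, where sigma denotes the vector of standard deviations, and it is computed in time linear in the embedding dimension. The function name `w2_diag_gaussians` and its parameters are illustrative, not taken from the paper.

```python
import math


def w2_diag_gaussians(mu1, sigma1, mu2, sigma2):
    """2-Wasserstein distance between N(mu1, diag(sigma1^2)) and N(mu2, diag(sigma2^2)).

    Assumption: both covariances are diagonal, so sigma1 and sigma2 are
    vectors of per-dimension standard deviations (not variances).
    """
    if not (len(mu1) == len(sigma1) == len(mu2) == len(sigma2)):
        raise ValueError("all inputs must have the same dimension")
    squared = 0.0
    # One pass over the dimensions: cost is linear in the embedding size.
    for m1, s1, m2, s2 in zip(mu1, sigma1, mu2, sigma2):
        squared += (m1 - m2) ** 2 + (s1 - s2) ** 2
    return math.sqrt(squared)


# Three hypothetical node embeddings (mean vector, standard-deviation vector).
a = ([0.0, 0.0], [1.0, 1.0])
b = ([3.0, 4.0], [1.0, 1.0])
c = ([3.0, 4.0], [2.0, 1.0])

d_ab = w2_diag_gaussians(*a, *b)
d_bc = w2_diag_gaussians(*b, *c)
d_ac = w2_diag_gaussians(*a, *c)
```

Because the 2-Wasserstein distance is a true metric, it satisfies the triangle inequality (`d_ac <= d_ab + d_bc` here). That property is what makes it suitable for preserving transitivity: two nodes that are both close to a third cannot be arbitrarily far from each other.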
