Kernel Node Embeddings

Learning representations of nodes in a low-dimensional space is a crucial task with many applications in network analysis, including link prediction and node classification. Two popular approaches to this problem are matrix factorization and random walk-based models. In this paper, we aim to bring together the best of both worlds for learning latent node representations. In particular, we propose a weighted matrix factorization model that encodes random walk-based information about the nodes of the graph. The main benefit of this formulation is that it allows kernel functions to be used in the computation of the embeddings. We perform an empirical evaluation on real-world networks, showing that the proposed model outperforms baseline node embedding algorithms in two downstream machine learning tasks.
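
To make the idea of a kernelized, weighted factorization of random-walk statistics more concrete, the following is a minimal sketch and not the model as defined in the paper. It assumes a toy graph, uniform random walks, a Gaussian kernel, binary targets (1 if two nodes co-occur in a walk window, 0 otherwise), weights proportional to co-occurrence counts, and plain stochastic gradient updates. All function names and parameters (`random_walks`, `cooccurrence`, `fit`, `neg_weight`, and so on) are hypothetical choices made for illustration.

```python
import math
import random


def random_walks(adj, num_walks, walk_len, rng):
    """Generate uniform random walks starting from every node."""
    walks = []
    for _ in range(num_walks):
        for start in adj:
            walk = [start]
            while len(walk) < walk_len:
                walk.append(rng.choice(adj[walk[-1]]))
            walks.append(walk)
    return walks


def cooccurrence(walks, n, window):
    """Count how often node u appears within `window` steps of node v."""
    counts = [[0] * n for _ in range(n)]
    for walk in walks:
        for i, v in enumerate(walk):
            lo, hi = max(0, i - window), min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[v][walk[j]] += 1
    return counts


def gaussian_kernel(x, y, sigma):
    """k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))


def fit(counts, dim, sigma, lr, epochs, neg_weight, rng):
    """Fit A, B so that k(A[v], B[u]) approximates the binary co-occurrence
    target, minimizing sum of weight * (target - k)^2 over pairs v != u.
    This objective is an assumption of the sketch."""
    n = len(counts)
    max_c = max(counts[v][u] for v in range(n) for u in range(n) if u != v)
    A = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(n)]
    B = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(n)]
    for _ in range(epochs):
        for v in range(n):
            for u in range(n):
                if u == v:
                    continue
                c = counts[v][u]
                target = 1.0 if c > 0 else 0.0
                weight = c / max_c if c > 0 else neg_weight
                k = gaussian_kernel(A[v], B[u], sigma)
                # Gradient of weight * (target - k)^2 with respect to A[v] is
                # coeff * (A[v] - B[u]); the gradient for B[u] is its negative.
                coeff = 2.0 * weight * (target - k) * k / sigma ** 2
                for d in range(dim):
                    diff = A[v][d] - B[u][d]
                    A[v][d] -= lr * coeff * diff
                    B[u][d] += lr * coeff * diff
    return A, B


# Toy graph (an assumption): two triangles joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
rng = random.Random(0)
walks = random_walks(adj, num_walks=10, walk_len=8, rng=rng)
counts = cooccurrence(walks, n=len(adj), window=2)
A, B = fit(counts, dim=2, sigma=1.0, lr=0.05, epochs=200, neg_weight=0.2, rng=rng)
```

In this sketch the rows of `A` play the role of node embeddings and the rows of `B` the role of context embeddings. Swapping `gaussian_kernel` for another kernel only requires changing the kernel function and the corresponding gradient coefficient.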
