Wasserstein Embedding for Graph Learning

We present Wasserstein Embedding for Graph Learning (WEGL), a novel and fast framework for embedding entire graphs in a vector space, in which standard machine learning models can be applied to graph-level prediction tasks. We build on the idea of defining the similarity between graphs as a function of the similarity between their node embedding distributions. Specifically, we use the Wasserstein distance to measure the dissimilarity between the node embeddings of different graphs. Unlike prior work, we avoid pairwise distance calculations between graphs and reduce the computational complexity from quadratic to linear in the number of graphs. WEGL calculates Monge maps from a reference distribution to each graph's node embedding and, based on these maps, creates a fixed-size vector representation of the graph. We evaluate our graph embedding approach on various benchmark graph-property prediction tasks, showing state-of-the-art classification performance together with superior computational efficiency.
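
The embedding step described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several assumptions that the abstract does not state: nodes carry uniform mass, the transport cost is squared Euclidean distance, the Monge map is approximated by the barycentric projection of an optimal transport plan, and the plan is found with SciPy's general-purpose linear programming solver. The function names `transport_plan` and `wegl_embed`, the random reference set, and the random node embeddings are hypothetical stand-ins.

```python
import numpy as np
from scipy.optimize import linprog


def transport_plan(ref, emb):
    """Optimal transport plan between uniform measures on `ref` and `emb`.

    Assumption: uniform node weights and squared Euclidean cost.
    """
    n, m = ref.shape[0], emb.shape[0]
    # Pairwise squared Euclidean distances, shape (n, m).
    cost = ((ref[:, None, :] - emb[None, :, :]) ** 2).sum(axis=2)
    # The plan is flattened row-major, so entry (i, j) sits at index i * m + j.
    rows = np.kron(np.eye(n), np.ones((1, m)))  # each reference point sends 1/n
    cols = np.kron(np.ones((1, n)), np.eye(m))  # each node receives 1/m
    a_eq = np.vstack([rows, cols])
    b_eq = np.concatenate([np.full(n, 1.0 / n), np.full(m, 1.0 / m)])
    result = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq,
                     bounds=(0, None), method="highs")
    return result.x.reshape(n, m)


def wegl_embed(ref, emb):
    """Map one graph's node embeddings to a vector of length n * d."""
    n = ref.shape[0]
    plan = transport_plan(ref, emb)
    # Barycentric projection: where each reference point is sent on average.
    # This stands in for the Monge map from the reference to the graph.
    mapped = n * (plan @ emb)
    # Displacement from the reference, scaled so that the norm of the vector
    # relates to the transport cost between the reference and the graph.
    return ((mapped - ref) / np.sqrt(n)).ravel()


rng = np.random.default_rng(0)
reference = rng.normal(size=(4, 2))  # hypothetical reference: 4 points in 2-D
graph_a = rng.normal(size=(7, 2))    # stand-in embeddings for a 7-node graph
graph_b = rng.normal(size=(5, 2))    # stand-in embeddings for a 5-node graph

vec_a = wegl_embed(reference, graph_a)
vec_b = wegl_embed(reference, graph_b)
print(vec_a.shape, vec_b.shape)      # both vectors have length 4 * 2 = 8
```

Each graph is processed once against the shared reference, so the number of transport problems grows linearly with the number of graphs, and graphs with different node counts produce vectors of the same length that can be passed to an ordinary classifier.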
