Fisher-Bures Adversary Graph Convolutional Networks

In a graph convolutional network, we assume that the graph $G$ is generated with respect to some observation noise. During learning, we apply small random perturbations $\Delta G$ to the graph to improve generalization. Based on quantum information geometry, $\Delta G$ can be characterized by the eigendecomposition of the graph Laplacian matrix. We minimize the loss with respect to the perturbed graph $G + \Delta G$ while making $\Delta G$ effective in terms of the Fisher information of the neural network. Our proposed model can consistently improve graph convolutional networks on semi-supervised node classification tasks with reasonable computational overhead. We present three different geometries on the manifold of graphs: the intrinsic geometry measures the information-theoretic dynamics of a graph; the extrinsic geometry characterizes how such dynamics can externally affect a graph neural network; and the embedding geometry measures node embeddings. These new analytical tools are useful for developing a good understanding of graph neural networks and for fostering new techniques.
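
To make the idea of perturbing a graph through the eigendecomposition of its Laplacian more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the model described above: the function names `laplacian_density_matrix` and `perturb_spectrum`, the unit-trace scaling, the multiplicative log-normal noise, and the `scale` parameter are all hypothetical choices, and the noise is purely random rather than guided by the Fisher information of a network.

```python
import numpy as np


def laplacian_density_matrix(adj):
    """Combinatorial Laplacian L = D - A, scaled to unit trace.

    Assumption: treating the trace-normalized Laplacian as a
    density-matrix-like object is an illustrative choice for this sketch.
    """
    adj = np.asarray(adj, dtype=float)
    lap = np.diag(adj.sum(axis=1)) - adj
    return lap / np.trace(lap)


def perturb_spectrum(rho, scale=0.01, rng=None):
    """Randomly perturb the eigenvalues of a symmetric PSD matrix.

    Assumption: multiplicative log-normal noise on the spectrum is a
    hypothetical perturbation model, chosen because it keeps eigenvalues
    non-negative. The eigenvectors are left unchanged.
    """
    rng = np.random.default_rng() if rng is None else rng
    eigvals, eigvecs = np.linalg.eigh(rho)
    noisy = eigvals * np.exp(scale * rng.standard_normal(eigvals.shape))
    noisy = np.clip(noisy, 0.0, None)  # remove tiny negative round-off
    noisy /= noisy.sum()               # restore unit trace
    # Reassemble V diag(noisy) V^T
    return (eigvecs * noisy) @ eigvecs.T


# Usage: a 3-node path graph
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
rho = laplacian_density_matrix(adj)
rho_perturbed = perturb_spectrum(rho, scale=0.05, rng=np.random.default_rng(0))
```

The perturbed matrix stays symmetric, positive semi-definite, and of unit trace, so it remains in the same family of objects as the original. A training loop in the spirit of the abstract would evaluate the loss on such a perturbed graph instead of the observed one.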
