Learning by Sampling and Compressing: Efficient Graph Representation Learning with Extremely Limited Annotations

The graph convolutional network (GCN) has attracted intensive research interest and has broad applications. While existing work has mainly focused on designing novel GCN architectures for better performance, few studies address a practical yet challenging problem: how to learn GCNs from data with extremely limited annotation. In this paper, we propose a new learning method that combines an adaptive sampling strategy with model compression to overcome this challenge. Our approach has several advantages: 1) the adaptive sampling strategy substantially reduces training deviation compared with uniform sampling; 2) compressed GCN-based models, with far fewer parameters, require less labeled data to train; 3) a smaller training set reduces the human cost of annotation. We choose six popular GCN baselines and conduct extensive experiments on three real-world datasets. The results show that with our method, all GCN baselines cut their annotation requirement by as much as 90% and compress their parameter count by more than 6× without sacrificing their strong performance. This verifies that our training method extends existing semi-supervised GCN-based methods to scenarios with extremely small amounts of labeled data.
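The abstract contrasts adaptive sampling with uniform sampling of nodes for annotation. The sketch below is only an illustration of this idea, not the paper's actual algorithm: it instantiates "adaptive" as degree-proportional importance sampling (a common heuristic), drawn without replacement, alongside a uniform baseline. The function names and the degree-based weighting are assumptions for the example.

```python
import random

def adaptive_sample(degrees, budget, seed=0):
    """Pick `budget` node indices with probability proportional to degree.

    Illustrative only: the paper's adaptive strategy is not specified in
    the abstract; degree-weighted sampling is one plausible instantiation.
    Samples without replacement, renormalizing weights after each draw.
    """
    rng = random.Random(seed)
    pool = {node: float(d) for node, d in enumerate(degrees)}
    chosen = []
    for _ in range(min(budget, len(pool))):
        nodes, weights = zip(*pool.items())
        pick = rng.choices(nodes, weights=weights, k=1)[0]
        chosen.append(pick)
        del pool[pick]  # remove so each node is labeled at most once
    return chosen

def uniform_sample(n_nodes, budget, seed=0):
    """Baseline: choose `budget` distinct nodes uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(range(n_nodes), min(budget, n_nodes))
```

Under a degree-weighted scheme, high-degree hub nodes are labeled more often, so the few annotated nodes cover more of the graph's neighborhood structure than a uniform draw of the same size.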
