Semi-supervised Adversarial Active Learning on Attributed Graphs

Active learning (AL) on attributed graphs has received increasing attention with the prevalence of graph-structured data. Although AL has been widely studied for alleviating label sparsity with conventional independent and identically distributed (i.i.d.) data, how to make it effective on attributed graphs remains an open research question. Existing AL algorithms on graphs attempt to reuse the classic AL query strategies designed for i.i.d. data. However, they suffer from two major limitations. First, different AL query strategies, calculated in distinct scoring spaces, are often naively combined to determine which nodes to label. Second, the AL query engine and the learning of the classifier are treated as two separate processes, resulting in unsatisfactory performance. In this paper, we propose a SEmi-supervised Adversarial active Learning (SEAL) framework on attributed graphs, which fully leverages the representation power of deep neural networks and devises a novel AL query strategy in an adversarial way. Our framework learns two adversarial components: a graph embedding network that encodes both the unlabeled and labeled nodes into a latent space and tries to trick the discriminator into regarding all nodes as already labeled, and a semi-supervised discriminator network that distinguishes the unlabeled nodes from the existing labeled nodes in the latent space. The divergence score, generated by the discriminator in a unified latent space, serves as the informativeness measure used to actively select the most informative node to be labeled by an oracle. The two adversarial components form a closed loop in which they mutually and simultaneously reinforce each other, enhancing active learning performance. Extensive experiments on four real-world networks validate the effectiveness of the SEAL framework, which outperforms state-of-the-art baselines.
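
The query step described above can be illustrated with a minimal sketch. It is not the SEAL implementation: the logistic discriminator, the definition of the divergence score as one minus the predicted probability of being labeled, and the names `divergence_scores` and `select_query` are all assumptions made for illustration. The sketch only shows how a single score computed in one latent space could pick the next node to send to the oracle.

```python
import numpy as np

def divergence_scores(embeddings, w, b):
    """Hypothetical discriminator: a logistic model on latent embeddings.

    Returns an assumed divergence score, 1 - P(node is labeled), so a higher
    score means the node looks less like the current labeled pool.
    """
    logits = embeddings @ w + b
    p_labeled = 1.0 / (1.0 + np.exp(-logits))
    return 1.0 - p_labeled

def select_query(embeddings, labeled_mask, w, b):
    """Return the index of the unlabeled node with the highest divergence."""
    scores = divergence_scores(embeddings, w, b)
    # Nodes that already have labels can never be queried again.
    scores = np.where(labeled_mask, -np.inf, scores)
    return int(np.argmax(scores))

# Toy latent embeddings for five nodes (illustrative values only).
Z = np.array([[2.0, 0.0],
              [1.0, 0.0],
              [-3.0, 0.0],
              [-5.0, 0.0],
              [-1.0, 0.0]])
labeled = np.array([True, False, False, True, False])
w = np.array([1.0, 0.0])  # assumed discriminator weights
b = 0.0                   # assumed discriminator bias

query = select_query(Z, labeled, w, b)
print("node to send to the oracle:", query)
```

In a full adversarial loop, the embedding network and the discriminator would be retrained after each newly acquired label, so the scores change from round to round; the sketch keeps both fixed to isolate the selection rule.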
