SEAL: Semisupervised Adversarial Active Learning on Attributed Graphs

Active learning (AL) on attributed graphs has received increasing attention with the prevalence of graph-structured data. Although AL has been widely studied as a way to alleviate label sparsity in conventional nonrelational data, how to make it effective on attributed graphs remains an open research question. Existing AL algorithms for node classification attempt to reuse the classic AL query strategies designed for nonrelational data. However, they suffer from two major limitations. First, different AL query strategies, computed in distinct scoring spaces, are often naively combined to determine which nodes should be labeled. Second, the AL query engine and the learning of the classifier are treated as two separate processes, resulting in unsatisfactory performance. In this article, we propose a SEmisupervised Adversarial active Learning (SEAL) framework on attributed graphs, which fully leverages the representation power of deep neural networks and devises a novel AL query strategy for node classification in an adversarial way. Our framework learns two adversarial components: a graph embedding network that encodes both the unlabeled and labeled nodes into a common latent space, aiming to trick the discriminator into regarding all nodes as already labeled, and a semisupervised discriminator network that distinguishes the unlabeled nodes from the existing labeled ones. The divergence score, generated by the discriminator in a unified latent space, serves as the informativeness measure used to actively select the most informative node to be labeled by an oracle. The two adversarial components form a closed loop and mutually and simultaneously reinforce each other to enhance AL performance. Extensive experiments on real-world networks validate the effectiveness of the SEAL framework, which outperforms state-of-the-art baselines on node classification tasks.
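
The query step described in the abstract can be illustrated with a minimal sketch. It is not the SEAL implementation: the abstract does not give the form of the divergence score, so the sketch assumes a hypothetical logistic discriminator that outputs the probability that a node embedding comes from a labeled node, and assumes the divergence score is one minus that probability. The names `prob_labeled`, `divergence_scores`, and `select_query_node` are illustrative only.

```python
import math
from typing import Dict, Sequence, Set


def prob_labeled(embedding: Sequence[float],
                 weights: Sequence[float],
                 bias: float) -> float:
    """Hypothetical discriminator: probability that an embedding is 'labeled'.

    Assumption: a single logistic unit stands in for the discriminator network.
    """
    logit = sum(w * x for w, x in zip(weights, embedding)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def divergence_scores(embeddings: Dict[int, Sequence[float]],
                      labeled: Set[int],
                      weights: Sequence[float],
                      bias: float) -> Dict[int, float]:
    """Score every unlabeled node in the shared latent space.

    Assumption: divergence = 1 - P(labeled), so nodes that look least like
    the labeled set receive the highest score.
    """
    return {
        node: 1.0 - prob_labeled(z, weights, bias)
        for node, z in embeddings.items()
        if node not in labeled
    }


def select_query_node(embeddings: Dict[int, Sequence[float]],
                      labeled: Set[int],
                      weights: Sequence[float],
                      bias: float) -> int:
    """Return the unlabeled node with the highest divergence score."""
    scores = divergence_scores(embeddings, labeled, weights, bias)
    if not scores:
        raise ValueError("no unlabeled nodes left to query")
    return max(scores, key=scores.get)


# Toy usage: four nodes with 2-D embeddings, node 0 already labeled.
embeddings = {0: [2.0, 0.0], 1: [0.5, 0.0], 2: [-1.0, 0.0], 3: [-3.0, 0.0]}
labeled = {0}
weights, bias = [1.0, 0.0], 0.0

query = select_query_node(embeddings, labeled, weights, bias)
labeled.add(query)  # the oracle would supply the label at this point
```

In the full framework the embedding network and the discriminator would be retrained after each query, which is the closed loop the abstract refers to; the sketch keeps both fixed and shows only the selection rule.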
