RIM: Reliable Influence-based Active Learning on Graphs

Message passing is the core of most graph models, such as the Graph Convolutional Network (GCN) and Label Propagation (LP), which usually require a large amount of clean labeled data to smooth out the neighborhood over the graph. However, the labeling process can be tedious, costly, and error-prone in practice. In this paper, we propose to unify active learning (AL) and message passing to minimize labeling costs, e.g., by making use of few and unreliable labels that can be obtained cheaply. We make two contributions towards that end. First, we open up a perspective by drawing a connection between AL enforcing message passing and social influence maximization, ensuring that the selected samples effectively improve the model performance. Second, we propose an extension to the influence model that incorporates an explicit quality factor to model label noise. In this way, we derive a fundamentally new AL selection criterion for GCN and LP, reliable influence maximization (RIM), which considers the quantity and the quality of influence simultaneously. Empirical studies on public datasets show that RIM significantly outperforms current AL methods in terms of accuracy and efficiency.
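To make the idea of weighing the quantity and quality of influence concrete, the following is a minimal illustrative sketch and not the RIM algorithm itself. It assumes that influence is measured by a few steps of row-normalized propagation, that each candidate node has a known label-quality score in [0, 1], and that a node counts as influenced once its strongest quality-weighted influence exceeds a threshold. The function names, the `threshold` parameter, and the greedy selection loop are all hypothetical choices made for illustration.

```python
import numpy as np

def propagation_matrix(adj, steps=2):
    """Row-normalized adjacency with self-loops, raised to the power `steps`."""
    a = adj + np.eye(adj.shape[0])
    p = a / a.sum(axis=1, keepdims=True)
    return np.linalg.matrix_power(p, steps)

def greedy_reliable_selection(adj, quality, budget, steps=2, threshold=0.1):
    """Greedily pick `budget` nodes to label (illustrative assumption, not RIM).

    A node v counts as influenced when the largest quality-weighted influence
    it receives from any selected node exceeds `threshold`.
    """
    n = adj.shape[0]
    infl = propagation_matrix(adj, steps)       # infl[v, u]: influence of u on v
    weighted = infl * quality[np.newaxis, :]    # scale each source column by its quality
    best = np.zeros(n)                          # strongest weighted influence received so far
    selected = []
    for _ in range(budget):
        gains = np.full(n, -1)
        for u in range(n):
            if u in selected:
                continue
            # Number of nodes influenced if u were added to the selection.
            covered = np.maximum(best, weighted[:, u]) > threshold
            gains[u] = covered.sum()
        pick = int(np.argmax(gains))
        selected.append(pick)
        best = np.maximum(best, weighted[:, pick])
    return selected

# Toy path graph 0-1-2-3-4.
adj = np.zeros((5, 5))
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

print(greedy_reliable_selection(adj, np.ones(5), budget=1))
noisy = np.array([1.0, 1.0, 0.2, 1.0, 1.0])     # the central node has unreliable labels
print(greedy_reliable_selection(adj, noisy, budget=1))
```

With uniform quality the sketch selects the central node of the path, since it reaches the most nodes. Once the central node's label is marked as unreliable, its weighted influence falls below the threshold and a neighbor is selected instead, which is the kind of trade-off between reach and reliability that the abstract describes.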
