Data-Adaptive Active Sampling for Efficient Graph-Cognizant Classification

This paper deals with active sampling of graph nodes representing training data for binary classification. The graph may be given or constructed using similarity measures among nodal features. Leveraging the graph for classification builds on the premise that labels across neighboring nodes are correlated according to a categorical Markov random field (MRF). This model is further relaxed to a Gaussian (G)MRF with labels taking continuous values—an approximation that not only mitigates the combinatorial complexity of the categorical model, but also offers optimal unbiased soft predictors of the unlabeled nodes. The proposed sampling strategy queries the node whose label disclosure is expected to inflict the largest change on the GMRF, and in this sense it is the most informative on average. Connections are established to other sampling methods, including uncertainty sampling, variance minimization, and sampling based on the $\Sigma$-optimality criterion. A simple yet effective heuristic is also introduced for increasing the exploration capabilities of the sampler, and for reducing the bias of the resultant classifier, by adjusting the confidence in the model's label predictions. The novel sampling strategies rely on quantities that are readily available without model retraining, rendering them computationally efficient and scalable to large graphs. Numerical tests using synthetic and real data demonstrate that the proposed methods achieve accuracy comparable to or better than that of the state of the art, even at reduced runtime.
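The expected-change sampling rule described above can be illustrated with a minimal sketch. The GMRF soft predictor of the unlabeled nodes is the harmonic solution $\mathbf{f}_u = -\mathbf{L}_{uu}^{-1}\mathbf{L}_{ul}\mathbf{y}_l$ driven by the graph Laplacian, and the sampler queries the node whose revealed label would, in expectation under the current soft predictions, change the predictor the most. The function names (`harmonic_predict`, `expected_change_query`) are illustrative, and this naive version refits the predictor for every candidate label; the paper's strategies avoid such retraining via quantities that are readily available in closed form.

```python
import numpy as np

def harmonic_predict(L, labeled, y_l, unlabeled):
    """GMRF soft predictions on unlabeled nodes: f_u = -L_uu^{-1} L_ul y_l."""
    L_uu = L[np.ix_(unlabeled, unlabeled)]
    L_ul = L[np.ix_(unlabeled, labeled)]
    return np.linalg.solve(L_uu, -L_ul @ np.asarray(y_l, dtype=float))

def expected_change_query(L, labeled, y_l, unlabeled):
    """Query the node whose label disclosure changes the GMRF predictions
    the most, averaged over its two possible labels (naive recomputation)."""
    f = harmonic_predict(L, labeled, y_l, unlabeled)
    best_node, best_score = None, -np.inf
    for i, v in enumerate(unlabeled):
        p1 = float(np.clip(f[i], 0.0, 1.0))  # soft label as pseudo-probability
        rest = [u for u in unlabeled if u != v]
        f_old = np.delete(f, i)
        score = 0.0
        for y_v, p in ((0.0, 1.0 - p1), (1.0, p1)):
            f_new = harmonic_predict(L, list(labeled) + [v],
                                     np.append(y_l, y_v), rest)
            score += p * np.abs(f_new - f_old).sum()  # expected model change
        if score > best_score:
            best_node, best_score = v, score
    return best_node

# toy example: 5-node path graph 0-1-2-3-4 with endpoints labeled 0 and 1
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A  # combinatorial graph Laplacian
print(expected_change_query(L, [0, 4], np.array([0.0, 1.0]), [1, 2, 3]))  # -> 2
```

On the path graph the harmonic predictions interpolate linearly between the labeled endpoints, and the midpoint node 2 is selected: it is both the most uncertain (soft label 0.5) and the most influential on the remaining predictions.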
