Label Propagation with Weak Supervision

Semi-supervised learning and weakly supervised learning are important paradigms that aim to reduce the growing demand for labeled data in current machine learning applications. In this paper, we introduce a novel analysis of the classical label propagation algorithm (LPA) (Zhu & Ghahramani, 2002) that additionally takes advantage of useful prior information, specifically probabilistic hypothesized labels on the unlabeled data. We provide an error bound that exploits both the local geometric properties of the underlying graph and the quality of the prior information. We also propose a framework to incorporate multiple sources of noisy information. In particular, we consider the setting of weak supervision, where our sources of information are weak labelers. We demonstrate the effectiveness of our approach on multiple benchmark weakly supervised classification tasks, showing improvements over existing semi-supervised and weakly supervised methods.
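
As a rough illustration of the idea, the Python sketch below first aggregates weak-labeler votes into a probabilistic prior and then runs a clamped label-propagation iteration that mixes in that prior. The count-based aggregation rule, the mixing parameter alpha, and the function names are illustrative assumptions; this is not the paper's exact algorithm or the object of its error bound.

import numpy as np

def votes_to_prior(votes, n_classes, smoothing=1.0):
    """Turn weak-labeler votes into a probabilistic prior over classes.

    votes : (n, m) integer array; votes[i, j] is weak labeler j's class for
            example i, or -1 if it abstains. This simple count-based
            aggregation is an illustrative assumption, not the paper's rule.
    """
    n, _ = votes.shape
    counts = np.full((n, n_classes), smoothing)
    for k in range(n_classes):
        counts[:, k] += (votes == k).sum(axis=1)
    return counts / counts.sum(axis=1, keepdims=True)

def label_propagation_with_prior(W, Y, labeled_mask, P, alpha=0.3,
                                 n_iter=200, tol=1e-6):
    """Clamped LPA-style iteration with an extra pull toward the prior P.

    W : (n, n) symmetric non-negative affinity matrix.
    Y : (n, c) one-hot labels; only rows where labeled_mask is True are used.
    P : (n, c) probabilistic hypothesized labels (e.g. from weak labelers).
    """
    d = W.sum(axis=1)
    d[d == 0] = 1.0
    T = W / d[:, None]                      # row-stochastic propagation matrix

    F = P.copy()
    F[labeled_mask] = Y[labeled_mask]
    for _ in range(n_iter):
        F_new = (1 - alpha) * (T @ F) + alpha * P   # propagate, then mix in prior
        F_new[labeled_mask] = Y[labeled_mask]       # clamp the labeled nodes
        if np.abs(F_new - F).max() < tol:
            F = F_new
            break
        F = F_new
    return F / F.sum(axis=1, keepdims=True)

Setting alpha = 0 recovers the usual clamped label-propagation update, while larger alpha weights the weak-labeler prior more heavily on the unlabeled nodes.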

[1]  Daniel Y. Fu, et al. Shoring Up the Foundations: Fusing Model Embeddings and Weak Supervision, 2022, UAI.

[2]  Maria-Florina Balcan, et al. Learning Predictions for Algorithms with Predictions, 2022, NeurIPS.

[3]  Chidubem Arachie, et al. Data Consistency for Weakly Supervised Learning, 2022, ArXiv.

[4]  Shuigeng Zhou, et al. DP-SSL: Towards Robust Semi-supervised Learning with A Few Labeled Samples, 2021, NeurIPS.

[5]  Alexander Ratner, et al. WRENCH: A Comprehensive Benchmark for Weak Supervision, 2021, NeurIPS Datasets and Benchmarks.

[6]  Avrim Blum, et al. The Bottleneck, 2021, Monopsony Capitalism.

[7]  Jeff Z. HaoChen, et al. Provable Guarantees for Self-Supervised Deep Learning with Spectral Contrastive Loss, 2021, NeurIPS.

[8]  J. Lee, et al. A Theory of Label Propagation for Subpopulation Shift, 2021, ICML.

[9]  Qian Huang, et al. Combining Label Propagation and Simple Models Out-performs Graph Neural Networks, 2020, ICLR.

[10]  Xiangnan He, et al. On the Equivalence of Decoupled Graph Convolution Network and Label Propagation, 2020, WWW.

[11]  Colin Wei, et al. Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data, 2020, ICLR.

[12]  Chidubem Arachie, et al. Constrained Labeling for Weakly Supervised Learning, 2020, UAI.

[13]  Sergei Vassilvitskii, et al. Algorithms with Predictions, 2020, Beyond the Worst-Case Analysis of Algorithms.

[14]  Sunita Sarawagi, et al. Learning from Rules Generalizing Labeled Exemplars, 2020, ICLR.

[15]  Christopher Ré, et al. Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods, 2020, ICML.

[16]  Jure Leskovec, et al. Unifying Graph Convolutional Neural Networks and Label Propagation, 2020, ArXiv.

[17]  David Berthelot, et al. FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence, 2020, NeurIPS.

[18]  Bert Huang, et al. Stochastic Generalized Adversarial Label Learning, 2019.

[19]  Stephen Lin, et al. Deep Metric Transfer for Label Propagation with Limited Annotated Data, 2019, IEEE/CVF International Conference on Computer Vision Workshops (ICCVW).

[20]  Frederic Sala, et al. Training Complex Models with Multi-Task Weak Supervision, 2018, AAAI.

[21]  Eunho Yang, et al. Learning to Propagate Labels: Transductive Propagation Network for Few-Shot Learning, 2018, ICLR.

[22]  Masashi Sugiyama, et al. Co-teaching: Robust training of deep neural networks with extremely noisy labels, 2018, NeurIPS.

[23]  Christopher Ré, et al. Snorkel: Rapid Training Data Creation with Weak Supervision, 2017, Proc. VLDB Endow.

[24]  Yuto Yamaguchi, et al. When Does Label Propagation Fail? A View from a Network Generative Model, 2017, IJCAI.

[25]  Paul Vernaza, et al. Learning Random-Walk Label Propagation for Weakly-Supervised Semantic Segmentation, 2017, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Jure Leskovec, et al. Inductive Representation Learning on Large Graphs, 2017, NIPS.

[27]  Samuel S. Schoenholz, et al. Neural Message Passing for Quantum Chemistry, 2017, ICML.

[28]  Timo Aila, et al. Temporal Ensembling for Semi-Supervised Learning, 2016, ICLR.

[29]  Max Welling, et al. Semi-Supervised Classification with Graph Convolutional Networks, 2016, ICLR.

[30]  Christos Faloutsos, et al. CAMLP: Confidence-Aware Modulated Label Propagation, 2016, SDM.

[31]  Tolga Tasdizen, et al. Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning, 2016, NIPS.

[32]  Christopher De Sa, et al. Data Programming: Creating Large Training Sets, Quickly, 2016, NIPS.

[33]  Jian Sun, et al. Deep Residual Learning for Image Recognition, 2016, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Lidong Bing, et al. Improving Distant Supervision for Information Extraction Using Label Propagation Through Lists, 2015, EMNLP.

[35]  Joan Bruna, et al. Deep Convolutional Networks on Graph-Structured Data, 2015, ArXiv.

[36]  Philip Bachman, et al. Learning with Pseudo-Ensembles, 2014, NIPS.

[37]  Masayuki Karasuyama, et al. Manifold-based Similarity Adaptation for Label Propagation, 2013, NIPS.

[38]  Partha Pratim Talukdar, et al. Scaling Graph-based Semi Supervised Learning to Large Number of Labels Using Count-Min Sketch, 2013, AISTATS.

[39]  Stergios B. Fotopoulos, et al. All of Nonparametric Statistics, 2007, Technometrics.

[40]  P. Niyogi, et al. Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples, 2006, J. Mach. Learn. Res.

[41]  Fei Wang, et al. Label Propagation through Linear Neighborhoods, 2006, IEEE Transactions on Knowledge and Data Engineering.

[42]  F. Scarselli, et al. A new model for learning in graph domains, 2005, Proceedings of the IEEE International Joint Conference on Neural Networks.

[43]  Maria-Florina Balcan, et al. Co-Training and Expansion: Towards Bridging Theory and Practice, 2004, NIPS.

[44]  Mikhail Belkin, et al. Semi-Supervised Learning on Riemannian Manifolds, 2004, Machine Learning.

[45]  Bernhard Schölkopf, et al. Learning with Local and Global Consistency, 2003, NIPS.

[46]  Zoubin Ghahramani, et al. Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions, 2003, ICML.

[47]  Zoubin Ghahramani, et al. Learning from labeled and unlabeled data with label propagation, 2002.

[48]  Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1997, EuroCOLT.

[49]  Eli Upfal, et al. Adversarial Multi Class Learning under Weak Supervision with Performance Guarantees, 2021, ICML.

[50]  Eli Upfal, et al. Semi-Supervised Aggregation of Dependent Weak Supervision Sources With Performance Guarantees, 2021, AISTATS.

[51]  Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.

[52]  Quan Pan, et al. SELP: Semi-supervised evidential label propagation algorithm for graph data clustering, 2018, Int. J. Approx. Reason.

[53]  Ah Chung Tsoi, et al. The Graph Neural Network Model, 2009, IEEE Transactions on Neural Networks.