Adversarial Feature Distribution Alignment for Semi-Supervised Learning

Training deep neural networks with only a few labeled samples can lead to overfitting, which is a central difficulty in semi-supervised learning (SSL), where labeled data are scarce by design. In this paper, we show that a consequence of overfitting in SSL is a feature distribution misalignment between labeled and unlabeled samples. Hence, we propose a new adversarial feature distribution alignment method, which is particularly effective when only a small number of labeled samples is available. We evaluate our method on CIFAR-10 and SVHN. On SVHN we achieve test errors of 3.88% (250 labeled samples) and 3.39% (1000 labeled samples), close to the fully supervised model's 2.89% (73k labeled samples); in comparison, the current state of the art achieves 4.29% and 3.74%, respectively. Finally, we provide theoretical insight into why the feature distribution misalignment occurs and show that our method reduces it.
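To make the core idea concrete, the following is a minimal sketch of adversarial feature distribution alignment between labeled and unlabeled batches, written in PyTorch in the spirit of gradient-reversal domain-adversarial training. Everything here is an illustrative assumption rather than the paper's exact method: the backbone, the discriminator architecture, the gradient-reversal formulation, and the loss weight lam are hypothetical stand-ins.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GradReverse(torch.autograd.Function):
        """Gradient reversal: identity in the forward pass, negated
        (and scaled) gradient in the backward pass."""
        @staticmethod
        def forward(ctx, x, lam):
            ctx.lam = lam
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad):
            return -ctx.lam * grad, None

    def grad_reverse(x, lam=1.0):
        return GradReverse.apply(x, lam)

    # Hypothetical components: any CNN backbone would do; a tiny MLP
    # stands in here so the sketch runs as-is on CIFAR-sized inputs.
    feature_extractor = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU())
    classifier = nn.Linear(128, 10)
    discriminator = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))

    opt = torch.optim.Adam(list(feature_extractor.parameters()) +
                           list(classifier.parameters()) +
                           list(discriminator.parameters()), lr=1e-3)

    def train_step(x_lab, y_lab, x_unlab, lam=0.1):
        """One SSL step: supervised loss on the labeled batch plus an
        adversarial loss that aligns labeled/unlabeled features."""
        f_lab = feature_extractor(x_lab)
        f_unlab = feature_extractor(x_unlab)

        sup_loss = F.cross_entropy(classifier(f_lab), y_lab)

        # The discriminator learns to tell labeled (1) from unlabeled (0)
        # features; the gradient-reversal layer makes the feature
        # extractor learn to fool it, pulling the two distributions together.
        feats = torch.cat([grad_reverse(f_lab, lam), grad_reverse(f_unlab, lam)])
        domain = torch.cat([torch.ones(len(f_lab), 1), torch.zeros(len(f_unlab), 1)])
        align_loss = F.binary_cross_entropy_with_logits(discriminator(feats), domain)

        loss = sup_loss + align_loss
        opt.zero_grad()
        loss.backward()
        opt.step()
        return loss.item()

    # Hypothetical usage with random CIFAR-10-sized tensors:
    x_l, y_l = torch.randn(32, 3, 32, 32), torch.randint(0, 10, (32,))
    x_u = torch.randn(128, 3, 32, 32)
    print(train_step(x_l, y_l, x_u))

The point the sketch illustrates is the min-max structure: the discriminator is trained to separate labeled-batch features from unlabeled-batch features, while the reversed gradient pushes the shared feature extractor to make the two distributions indistinguishable, directly targeting the misalignment described above.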
