FMixCutMatch for semi-supervised deep learning

Mixed sample augmentation (MSA), which mixes two training samples as an augmentation strategy to smooth the training space, has seen great success in semi-supervised learning (SSL). Motivated in particular by insights into the efficacy of cut-mix augmentation, we propose FMixCut, an MSA that combines Fourier space-based data mixing (FMix) with our proposed Fourier space-based data cutting (FCut) for labeled and unlabeled data augmentation. Specifically, for the SSL task, our approach first generates soft pseudo-labels from the model's previous predictions. The model is then trained so that its outputs on FMix-generated samples are consistent with the correspondingly mixed soft pseudo-labels. In addition, we propose FCut, a new Cutout-based data augmentation strategy that reuses the two complementary masked sample pairs from FMix for weighted cross-entropy minimization. Furthermore, we boost training efficiency with two regularization techniques: batch label-distribution entropy maximization and sample confidence entropy minimization. Finally, we introduce a dynamic labeled-unlabeled data mixing (DDM) strategy that further accelerates the convergence of the model. Combining the above components, we call our SSL approach FMixCutMatch, or FMCmatch for short. The proposed FMCmatch achieves state-of-the-art performance on CIFAR-10/100, SVHN and Mini-ImageNet across a variety of SSL settings with the CNN-13, WRN-28-2 and ResNet-18 networks. In particular, our method achieves a 4.54% test error on CIFAR-10 with 4K labels using CNN-13 and a 41.25% Top-1 test error on Mini-ImageNet with 10K labels using ResNet-18. Our code for reproducing these results is publicly available at https://github.com/biuyq/FMixCutMatch.
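
For readers unfamiliar with FMix, the sketch below illustrates the published idea behind Fourier space-based mask generation: sample complex Gaussian noise in the frequency domain, attenuate high frequencies, invert the FFT, and binarise the result so that a fraction lam of pixels is kept. This is a minimal NumPy sketch of that general recipe, not the authors' implementation; the function names (`low_freq_image`, `fmix_mask`) and the default `decay_power` are illustrative assumptions.

```python
import numpy as np

def low_freq_image(h, w, decay_power=3.0):
    """Sample a grey-scale image whose spectrum decays as 1/f^decay_power."""
    fy = np.fft.fftfreq(h)[:, None]              # vertical frequencies
    fx = np.fft.fftfreq(w)[None, :]              # horizontal frequencies
    freq = np.sqrt(fx ** 2 + fy ** 2)
    scale = 1.0 / np.maximum(freq, 1.0 / max(h, w)) ** decay_power
    spectrum = scale * (np.random.randn(h, w) + 1j * np.random.randn(h, w))
    return np.real(np.fft.ifft2(spectrum))       # smooth low-frequency image

def fmix_mask(h, w, lam, decay_power=3.0):
    """Binarise a low-frequency image so a fraction `lam` of pixels is 1."""
    img = low_freq_image(h, w, decay_power)
    order = img.ravel().argsort()[::-1]          # pixel indices, high to low
    mask = np.zeros(h * w)
    mask[order[: int(lam * h * w)]] = 1.0        # top-lam fraction becomes 1
    return mask.reshape(h, w)

# Mixing two images x1, x2 with the resulting mask:
#   x_mix = mask * x1 + (1 - mask) * x2
# with the mixed (soft pseudo-)label lam * y1 + (1 - lam) * y2.
```

Because the thresholded low-frequency noise yields large contiguous regions rather than a rectangular patch, the two complementary masked parts (mask * x1 and (1 - mask) * x2) each remain recognisable, which is what makes them reusable as the masked sample pairs that FCut consumes.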
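The two regularizers named above also have a standard generic form: maximize the entropy of the batch-averaged class distribution (so the model does not collapse onto a few classes) while minimizing each sample's prediction entropy (so individual predictions stay confident). The PyTorch sketch below shows one common way to realize such a pair of terms; the exact formulation and weighting used by FMCmatch may differ, and `entropy_regularizers` is a hypothetical helper name.

```python
import torch
import torch.nn.functional as F

def entropy_regularizers(logits, eps=1e-8):
    """Generic batch-entropy-max / sample-entropy-min pair (sketch only)."""
    p = F.softmax(logits, dim=1)                 # (B, C) class probabilities
    p_mean = p.mean(dim=0)                       # batch-level label distribution
    batch_entropy = -(p_mean * torch.log(p_mean + eps)).sum()
    sample_entropy = -(p * torch.log(p + eps)).sum(dim=1).mean()
    # Added to the loss: maximizing batch entropy spreads predictions across
    # classes, while minimizing sample entropy sharpens each prediction.
    return -batch_entropy + sample_entropy
```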
