FMixCutMatch for semi-supervised deep learning

Mixed sample augmentation (MSA), which mixes two training samples as an augmentation strategy to smooth the training space, has seen great success in semi-supervised learning (SSL). Motivated in particular by insights into the efficacy of cut-mix augmentation, we propose FMixCut, an MSA that combines Fourier space-based data mixing (FMix) with our proposed Fourier space-based data cutting (FCut) for labeled and unlabeled data augmentation. Specifically, for the SSL task, our approach first generates soft pseudo-labels from the model's previous predictions. The model is then trained so that its outputs on FMix-generated samples are consistent with the correspondingly mixed soft pseudo-labels. In addition, we propose FCut, a new Cutout-based data augmentation strategy that reuses the two complementary masked sample pairs from FMix for weighted cross-entropy minimization. Furthermore, we boost training efficiency with two regularization techniques: batch label-distribution entropy maximization and sample confidence entropy minimization. Finally, we introduce a dynamic labeled-unlabeled data mixing (DDM) strategy that further accelerates the convergence of the model. Combining the above components, we call our SSL approach FMixCutMatch, or FMCmatch for short. The proposed FMCmatch achieves state-of-the-art performance on CIFAR-10/100, SVHN and Mini-ImageNet across a variety of SSL settings with the CNN-13, WRN-28-2 and ResNet-18 networks. In particular, our method achieves a 4.54% test error on CIFAR-10 with 4K labels using CNN-13 and a 41.25% Top-1 test error on Mini-ImageNet with 10K labels using ResNet-18. Our code for reproducing these results is publicly available at https://github.com/biuyq/FMixCutMatch.
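
For readers unfamiliar with FMix, the sketch below illustrates the published idea behind Fourier space-based mask generation: sample complex Gaussian noise in the frequency domain, attenuate high frequencies, invert the FFT, and binarise the result so that a fraction lam of pixels is kept. This is a minimal NumPy sketch of that general recipe, not the authors' implementation; the function names (`low_freq_image`, `fmix_mask`) and the default `decay_power` are illustrative assumptions.

```python
import numpy as np

def low_freq_image(h, w, decay_power=3.0):
    """Sample a grey-scale image whose spectrum decays as 1/f^decay_power."""
    fy = np.fft.fftfreq(h)[:, None]              # vertical frequencies
    fx = np.fft.fftfreq(w)[None, :]              # horizontal frequencies
    freq = np.sqrt(fx ** 2 + fy ** 2)
    scale = 1.0 / np.maximum(freq, 1.0 / max(h, w)) ** decay_power
    spectrum = scale * (np.random.randn(h, w) + 1j * np.random.randn(h, w))
    return np.real(np.fft.ifft2(spectrum))       # smooth low-frequency image

def fmix_mask(h, w, lam, decay_power=3.0):
    """Binarise a low-frequency image so a fraction `lam` of pixels is 1."""
    img = low_freq_image(h, w, decay_power)
    order = img.ravel().argsort()[::-1]          # pixel indices, high to low
    mask = np.zeros(h * w)
    mask[order[: int(lam * h * w)]] = 1.0        # top-lam fraction becomes 1
    return mask.reshape(h, w)

# Mixing two images x1, x2 with the resulting mask:
#   x_mix = mask * x1 + (1 - mask) * x2
# with the mixed (soft pseudo-)label lam * y1 + (1 - lam) * y2.
```

Because the thresholded low-frequency noise yields large contiguous regions rather than a rectangular patch, the two complementary masked parts (mask * x1 and (1 - mask) * x2) each remain recognisable, which is what makes them reusable as the masked sample pairs that FCut consumes.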
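The two regularizers named above also have a standard generic form: maximize the entropy of the batch-averaged class distribution (so the model does not collapse onto a few classes) while minimizing each sample's prediction entropy (so individual predictions stay confident). The PyTorch sketch below shows one common way to realize such a pair of terms; the exact formulation and weighting used by FMCmatch may differ, and `entropy_regularizers` is a hypothetical helper name.

```python
import torch
import torch.nn.functional as F

def entropy_regularizers(logits, eps=1e-8):
    """Generic batch-entropy-max / sample-entropy-min pair (sketch only)."""
    p = F.softmax(logits, dim=1)                 # (B, C) class probabilities
    p_mean = p.mean(dim=0)                       # batch-level label distribution
    batch_entropy = -(p_mean * torch.log(p_mean + eps)).sum()
    sample_entropy = -(p * torch.log(p + eps)).sum(dim=1).mean()
    # Added to the loss: maximizing batch entropy spreads predictions across
    # classes, while minimizing sample entropy sharpens each prediction.
    return -batch_entropy + sample_entropy
```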
