Split Batch Normalization: Improving Semi-Supervised Learning under Domain Shift

Recent work has shown that using unlabeled data in semi-supervised learning (SSL) is not always beneficial and can even hurt generalization, especially when there is a class mismatch between the unlabeled and the labeled examples. We investigate this phenomenon for image classification on the CIFAR-10 and ImageNet datasets, as well as under several other forms of domain shift (e.g., salt-and-pepper noise). Our main contribution is Split Batch Normalization (Split-BN), a technique that improves SSL when the additional unlabeled data comes from a shifted distribution. Split-BN maintains separate batch normalization statistics for labeled and unlabeled examples. Due to its simplicity, we recommend it as a standard practice. Finally, we analyse how domain shift affects the SSL training process. In particular, we find that during training the statistics of hidden activations in late layers become markedly different between the unlabeled and the labeled examples.
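To make the idea concrete, below is a minimal PyTorch sketch of a Split-BN layer, assuming the labeled/unlabeled split of each batch is known. The module name SplitBatchNorm2d, the is_unlabeled flag, and the choice to share the learnable scale and shift while splitting only the running statistics are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SplitBatchNorm2d(nn.Module):
    """Sketch of Split-BN: running normalization statistics are kept
    separately for labeled and unlabeled batches, while the learnable
    scale and shift are shared (an assumed design choice)."""

    def __init__(self, num_features, momentum=0.1, eps=1e-5):
        super().__init__()
        # Shared affine parameters (gamma, beta).
        self.weight = nn.Parameter(torch.ones(num_features))
        self.bias = nn.Parameter(torch.zeros(num_features))
        # Separate running statistics per data source.
        self.register_buffer("mean_lab", torch.zeros(num_features))
        self.register_buffer("var_lab", torch.ones(num_features))
        self.register_buffer("mean_unlab", torch.zeros(num_features))
        self.register_buffer("var_unlab", torch.ones(num_features))
        self.momentum = momentum
        self.eps = eps

    def forward(self, x, is_unlabeled=False):
        # Pick the statistics buffers matching the batch's source.
        if is_unlabeled:
            mean, var = self.mean_unlab, self.var_unlab
        else:
            mean, var = self.mean_lab, self.var_lab
        # In training mode F.batch_norm updates the chosen running buffers
        # in place; in eval mode it normalizes with them directly.
        return F.batch_norm(
            x, mean, var, self.weight, self.bias,
            training=self.training, momentum=self.momentum, eps=self.eps,
        )


# Usage sketch: normalize the labeled and unlabeled parts of a training
# step with their own statistics.
bn = SplitBatchNorm2d(64)
labeled = torch.randn(8, 64, 16, 16)
unlabeled = torch.randn(8, 64, 16, 16)
out_lab = bn(labeled)                         # labeled-domain statistics
out_unlab = bn(unlabeled, is_unlabeled=True)  # unlabeled-domain statistics
```

At evaluation time one would presumably normalize with the labeled-domain statistics, since the test data is assumed to follow the labeled distribution.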
