Safe Deep Semi-Supervised Learning for Unseen-Class Unlabeled Data

Deep semi-supervised learning (SSL) has recently been shown to be highly effective. However, its performance degrades severely when the class distribution is mismatched, a common case being that the unlabeled data contains classes never seen in the labeled data. Efforts on this issue remain limited. This paper proposes a simple and effective safe deep SSL method to alleviate the harm caused by such unseen-class unlabeled data. In theory, the model learned by the new method is never worse than one learned from the labeled data alone, and its generalization is guaranteed to approach the optimum at the rate O(√(d ln(n)/n)), which is even faster than the convergence rate of supervised learning with massive parameters. In experiments on benchmark data, whereas existing deep SSL methods fall behind plain supervised learning once 40% of the unlabeled data comes from unseen classes, the new method still achieves a performance gain even when more than 60% of the unlabeled data comes from unseen classes. Moreover, the proposal is compatible with many deep SSL algorithms and can be easily extended to handle other cases of class distribution mismatch.
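
One common realization of this "safe" goal in the literature is to learn per-example weights for the unlabeled data via bi-level optimization, so that unlabeled examples whose inclusion hurts performance on the labeled data are driven toward zero weight. Below is a minimal, hypothetical PyTorch sketch of that general idea; the toy linear model, the tiny weight network, and the one-step look-ahead scheme are illustrative assumptions, not the paper's exact algorithm.

```python
# Illustrative sketch: safe SSL via learned instance weights (bi-level).
# All names here (meta, alpha, w_lookahead, ...) are hypothetical.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d, n_lab, n_unl, lr = 5, 32, 64, 0.1

# Toy data: labeled binary data; the unlabeled pool may contain unseen classes.
x_lab = torch.randn(n_lab, d)
y_lab = (x_lab[:, 0] > 0).float()
x_unl = torch.randn(n_unl, d)

w = torch.zeros(d, requires_grad=True)   # linear classifier parameters
meta = torch.nn.Linear(1, 1)             # tiny weight network: loss -> weight

def labeled_loss(params):
    return F.binary_cross_entropy_with_logits(x_lab @ params, y_lab)

for step in range(200):
    # Pseudo-label the unlabeled pool (a simple SSL surrogate objective).
    with torch.no_grad():
        pseudo = (x_unl @ w > 0).float()
    per_ex = F.binary_cross_entropy_with_logits(
        x_unl @ w, pseudo, reduction="none")
    # Per-example weights alpha in (0, 1), predicted from each example's loss.
    alpha = torch.sigmoid(meta(per_ex.detach().unsqueeze(1))).squeeze(1)

    # Inner step: one look-ahead gradient step on the weighted SSL objective,
    # kept differentiable w.r.t. the weight network (create_graph=True).
    inner = labeled_loss(w) + (alpha * per_ex).mean()
    g = torch.autograd.grad(inner, w, create_graph=True)[0]
    w_lookahead = w - lr * g

    # Outer step: the look-ahead model must not hurt labeled-data performance;
    # this gradient pushes weights of harmful unlabeled examples toward zero.
    outer = labeled_loss(w_lookahead)
    meta_grads = torch.autograd.grad(outer, list(meta.parameters()))
    with torch.no_grad():
        for p, gm in zip(meta.parameters(), meta_grads):
            p -= lr * gm
        w -= lr * g.detach()             # commit the inner step
```

The key design choice in this kind of scheme is that the outer objective is evaluated only on labeled data, which is what yields the "never worse than supervised learning" flavor of guarantee: if all unlabeled examples are harmful, the learned weights can shrink their contribution to zero and the method degenerates gracefully to supervised training.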
