Improved Robustness to Open Set Inputs via Tempered Mixup

Supervised classification methods often assume that evaluation data is drawn from the same distribution as training data and that all classes are present for training. However, real-world classifiers must handle inputs that are far from the training distribution including samples from unknown classes. Open set robustness refers to the ability to properly label samples from previously unseen categories as novel and avoid high-confidence, incorrect predictions. Existing approaches have focused on either novel inference methods, unique training architectures, or supplementing the training data with additional background samples. Here, we propose a simple regularization technique easily applied to existing convolutional neural network architectures that improves open set robustness without a background dataset. Our method achieves state-of-the-art results on open set classification baselines and easily scales to large-scale open set classification problems.

[1]  Terrance E. Boult,et al.  Reducing Network Agnostophobia , 2018, NeurIPS.

[2]  R. Srikant,et al.  Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks , 2017, ICLR.

[3]  Lei Shu,et al.  DOC: Deep Open Classification of Text Documents , 2017, EMNLP.

[4]  Jason Weston,et al.  Vicinal Risk Minimization , 2000, NIPS.

[5]  Kibok Lee,et al.  Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples , 2017, ICLR.

[6]  ScheirerWalter,et al.  Toward Open Set Recognition , 2013 .

[7]  Saman Ghili,et al.  Tiny ImageNet Visual Recognition Challenge , 2014 .

[8]  Rick Salay,et al.  Detecting Out-of-Distribution Inputs in Deep Neural Networks Using an Early-Layer Output , 2019, ArXiv.

[9]  Terrance E. Boult,et al.  Towards Open World Recognition , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Ronald Kemker,et al.  Are Out-of-Distribution Detection Methods Effective on Large-Scale Datasets? , 2019, ArXiv.

[11]  Gregory Cohen,et al.  EMNIST: an extension of MNIST to handwritten letters , 2017, CVPR 2017.

[12]  Stefan Wermter,et al.  Continual Lifelong Learning with Neural Networks: A Review , 2019, Neural Networks.

[13]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Hongyi Zhang,et al.  mixup: Beyond Empirical Risk Minimization , 2017, ICLR.

[15]  Terrance E. Boult,et al.  Towards Open Set Deep Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Michael Healy,et al.  Theory and Applications of Ontology: Computer Applications , 2010 .

[17]  Graham W. Taylor,et al.  Improved Regularization of Convolutional Neural Networks with Cutout , 2017, ArXiv.

[18]  Seong Joon Oh,et al.  An Empirical Evaluation on Robustness and Uncertainty of Regularization Methods , 2020, ArXiv.

[19]  Bolei Zhou,et al.  Places: A 10 Million Image Database for Scene Recognition , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Kibok Lee,et al.  A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks , 2018, NeurIPS.

[21]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[22]  Ya Le,et al.  Tiny ImageNet Visual Recognition Challenge , 2015 .

[23]  Gopinath Chennupati,et al.  On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks , 2019, NeurIPS.

[24]  Seong Joon Oh,et al.  CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[25]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[26]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[27]  Vishal M. Patel,et al.  Learning Deep Features for One-Class Classification , 2018, IEEE Transactions on Image Processing.

[28]  Yu Qiao,et al.  A Discriminative Feature Learning Approach for Deep Face Recognition , 2016, ECCV.

[29]  Sergey Ioffe,et al.  Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Weng-Keen Wong,et al.  Open Set Learning with Counterfactual Images , 2018, ECCV.