Close Category Generalization for Out-of-Distribution Classification

Out-of-distribution generalization is a core challenge in machine learning. We introduce a new type of out-of-distribution evaluation, which we call close category generalization, and propose a method for it. The task specifies how a classifier should extrapolate to unseen classes through a bi-criteria objective: (i) on in-distribution examples, output the correct label, and (ii) on out-of-distribution examples, output the label of the nearest neighbor in the training set. In addition to formalizing this problem, we present a new training algorithm that improves the close category generalization of neural networks. We compare against many baselines, including robust training algorithms and out-of-distribution detection methods, and show that our method achieves better or comparable close category generalization. We then investigate a related representation learning task and find that performing well on close category generalization correlates with learning a good representation of an unseen class and with finding a good initialization for few-shot learning. Code available at this https URL
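
The bi-criteria objective can be illustrated with a small evaluation sketch. This is not the authors' code: the function names (`nearest_neighbor_labels`, `close_category_scores`), the use of Euclidean distance in input space, and the toy data are all assumptions made for illustration, since the abstract does not specify the distance metric or the evaluation protocol.

```python
import numpy as np


def nearest_neighbor_labels(train_x, train_y, query_x):
    """Label of the closest training point for each query point.

    Assumption: "nearest" means Euclidean distance in input space.
    """
    # Pairwise squared distances, shape (n_query, n_train).
    diffs = query_x[:, None, :] - train_x[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)
    return train_y[np.argmin(dists, axis=1)]


def close_category_scores(predict, train_x, train_y, test_x, test_y, ood_x):
    """Return (in-distribution accuracy, OOD nearest-neighbor agreement).

    `predict` is any callable mapping an array of inputs to predicted labels.
    """
    # Criterion (i): correct label on in-distribution examples.
    in_dist_acc = float(np.mean(predict(test_x) == test_y))
    # Criterion (ii): on unseen-class examples, the target is the label
    # of the nearest training example.
    nn_targets = nearest_neighbor_labels(train_x, train_y, ood_x)
    ood_agreement = float(np.mean(predict(ood_x) == nn_targets))
    return in_dist_acc, ood_agreement


# Toy data (hypothetical): two well-separated training classes in 2-D.
train_x = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [11.0, 10.0]])
train_y = np.array([0, 0, 1, 1])
test_x = np.array([[0.5, 0.0], [10.5, 10.0]])
test_y = np.array([0, 1])
ood_x = np.array([[3.0, 3.0], [8.0, 8.0], [6.0, 0.0]])


def threshold_classifier(x):
    # A stand-in classifier that splits on the first coordinate.
    return (x[:, 0] > 5).astype(int)


in_dist_acc, ood_agreement = close_category_scores(
    threshold_classifier, train_x, train_y, test_x, test_y, ood_x
)
```

On this toy data the threshold classifier labels both in-distribution test points correctly but disagrees with the nearest-neighbor label on one of the three out-of-distribution points. A model can therefore score well on criterion (i) while falling short on criterion (ii).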
