论文信息 - Test-Time Training with Self-Supervision for Generalization under Distribution Shifts

Test-Time Training with Self-Supervision for Generalization under Distribution Shifts

In this paper, we propose Test-Time Training, a general approach for improving the performance of predictive models when training and test data come from different distributions. We turn a single unlabeled test sample into a self-supervised learning problem, on which we update the model parameters before making a prediction. This also extends naturally to data in an online stream. Our simple approach leads to improvements on diverse image classification benchmarks aimed at evaluating robustness to distribution shifts.

[1] Michal Irani,et al. Super-resolution from a single image , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[2] Fabio Maria Carlucci,et al. Domain Generalization by Solving Jigsaw Puzzles , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[3] Moustapha Cissé,et al. Countering Adversarial Images using Input Transformations , 2018, ICLR.

[4] Matthias Hein,et al. Provable Robustness of ReLU networks via Maximization of Linear Regions , 2018, AISTATS.

[5] Gregory Shakhnarovich,et al. Colorization as a Proxy Task for Visual Understanding , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6] Xiaohui Xie,et al. Neural Multi-Scale Self-Supervised Registration for Echocardiogram Dense Tracking , 2019, bioRxiv.

[7] Alexei A. Efros,et al. Unsupervised Visual Representation Learning by Context Prediction , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[8] Subhransu Maji,et al. Boosting Supervision with Self-Supervision for Few-shot Learning , 2019, ArXiv.

[9] Alexander Gammerman,et al. Learning by Transduction , 1998, UAI.

[10] Michael I. Jordan,et al. Learning Transferable Features with Deep Adaptation Networks , 2015, ICML.

[11] Mehryar Mohri,et al. Algorithms and Theory for Multiple-Source Adaptation , 2018, NeurIPS.

[12] David A. Wagner,et al. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples , 2018, ICML.

[13] Luyu Wang,et al. advertorch v0.1: An Adversarial Robustness Toolbox based on PyTorch , 2019, ArXiv.

[14] Yang Song,et al. PixelDefend: Leveraging Generative Models to Understand and Defend against Adversarial Examples , 2017, ICLR.

[15] Kevin Gimpel,et al. A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks , 2016, ICLR.

[16] Marc'Aurelio Ranzato,et al. Gradient Episodic Memory for Continual Learning , 2017, NIPS.

[17] Dawn Song,et al. Using Self-Supervised Learning Can Improve Model Robustness and Uncertainty , 2019, NeurIPS.

[18] D. Tao,et al. Deep Domain Generalization via Conditional Invariant Adversarial Networks , 2018, ECCV.

[19] J. Zico Kolter,et al. Provable defenses against adversarial examples via the convex outer adversarial polytope , 2017, ICML.

[20] Matthias Bethge,et al. Generalisation in humans and deep neural networks , 2018, NeurIPS.

[21] Yongxin Yang,et al. Deeper, Broader and Artier Domain Generalization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[22] Nikos Komodakis,et al. Unsupervised Representation Learning by Predicting Image Rotations , 2018, ICLR.

[23] Deva Ramanan,et al. Online Model Distillation for Efficient Video Inference , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[24] Aleksander Madry,et al. Towards Deep Learning Models Resistant to Adversarial Attacks , 2017, ICLR.

[25] Kaiming He,et al. Rethinking ImageNet Pre-Training , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[26] Yi Sun,et al. Transfer of Adversarial Robustness Between Perturbation Types , 2019, ArXiv.

[27] Erik G. Learned-Miller,et al. Online domain adaptation of a pre-trained cascade of classifiers , 2011, CVPR 2011.

[28] Paolo Favaro,et al. Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles , 2016, ECCV.

[29] Michal Irani,et al. "Zero-Shot" Super-Resolution Using Deep Internal Learning , 2017, CVPR.

[30] Oriol Vinyals,et al. Matching Networks for One Shot Learning , 2016, NIPS.

[31] Thomas G. Dietterich,et al. Benchmarking Neural Network Robustness to Common Corruptions and Perturbations , 2018, ICLR.

[32] Armand Joulin,et al. Unsupervised Learning by Predicting Noise , 2017, ICML.

[33] John C. Duchi,et al. Certifying Some Distributional Robustness with Principled Adversarial Training , 2017, ICLR.

[34] Bolei Zhou,et al. Semantic photo manipulation with a generative image prior , 2019, ACM Trans. Graph..

[35] Trevor Darrell,et al. Continuous Manifold Based Adaptation for Evolving Visual Domains , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[36] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .

[37] Hugo Larochelle,et al. Optimization as a Model for Few-Shot Learning , 2016, ICLR.

[38] François Laviolette,et al. Domain-Adversarial Training of Neural Networks , 2015, J. Mach. Learn. Res..

[39] Trevor Darrell,et al. Adversarial Discriminative Domain Adaptation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40] Siddhartha Chaudhuri,et al. Generalizing Across Domains via Cross-Gradient Training , 2018, ICLR.

[41] Hang Li,et al. Meta-SGD: Learning to Learn Quickly for Few Shot Learning , 2017, ArXiv.

[42] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[43] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[44] Jian Sun,et al. Identity Mappings in Deep Residual Networks , 2016, ECCV.

[45] Swami Sankaranarayanan,et al. MetaReg: Towards Domain Generalization using Meta-Regularization , 2018, NeurIPS.

[46] J. Zico Kolter,et al. Certified Adversarial Robustness via Randomized Smoothing , 2019, ICML.

[47] Andrea Vedaldi,et al. Surprising Effectiveness of Few-Image Unsupervised Feature Learning , 2019, ArXiv.

[48] Aleksander Madry,et al. Robustness May Be at Odds with Accuracy , 2018, ICLR.

[49] Harini Kannan,et al. Adversarial Logit Pairing , 2018, NIPS 2018.

[50] Nilesh Tripuraneni,et al. Debiasing Linear Prediction , 2019, ArXiv.

[51] Paul A. Viola,et al. Rapid object detection using a boosted cascade of simple features , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[52] Elad Hazan,et al. Introduction to Online Convex Optimization , 2016, Found. Trends Optim..

[53] Trevor Darrell,et al. Discovering Latent Domains for Multisource Domain Adaptation , 2012, ECCV.

[54] Richard S. Zemel,et al. Prototypical Networks for Few-shot Learning , 2017, NIPS.

[55] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[56] Pietro Perona,et al. One-shot learning of object categories , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[57] Kilian Q. Weinberger,et al. Deep Networks with Stochastic Depth , 2016, ECCV.

[58] Derek Hoiem,et al. Learning without Forgetting , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[59] Kaiming He,et al. Group Normalization , 2018, ECCV.

[60] Alexei A. Efros,et al. Colorful Image Colorization , 2016, ECCV.

[61] Claire Cardie,et al. Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification , 2016, TACL.

[62] Daan Wierstra,et al. Meta-Learning with Memory-Augmented Neural Networks , 2016, ICML.

[63] Jonathon Shlens,et al. Explaining and Harnessing Adversarial Examples , 2014, ICLR.

[64] Mingjie Sun,et al. Rethinking the Value of Network Pruning , 2018, ICLR.

[65] Nikos Komodakis,et al. Dynamic Few-Shot Visual Learning Without Forgetting , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[66] Nilesh Tripuraneni,et al. Single Point Transductive Prediction , 2019, ICML.

[67] Shai Shalev-Shwartz,et al. Online Learning and Online Convex Optimization , 2012, Found. Trends Mach. Learn..

[68] Matthijs Douze,et al. Deep Clustering for Unsupervised Learning of Visual Features , 2018, ECCV.

[69] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[70] Bernhard Schölkopf,et al. Domain Generalization via Invariant Feature Representation , 2013, ICML.

[71] Mengjie Zhang,et al. Domain Generalization for Object Recognition with Multi-task Autoencoders , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[72] Kilian Q. Weinberger,et al. On Calibration of Modern Neural Networks , 2017, ICML.

[73] Yuan Shi,et al. Geodesic flow kernel for unsupervised domain adaptation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[74] Rich Caruana,et al. Multitask Learning , 1998, Encyclopedia of Machine Learning and Data Mining.

[75] Benjamin Recht,et al. Do Image Classifiers Generalize Across Time , 2019 .

[76] Michael I. Jordan,et al. Theoretically Principled Trade-off between Robustness and Accuracy , 2019, ICML.

[77] Gabriela Csurka,et al. Domain Adaptation for Visual Applications: A Comprehensive Survey , 2017, ArXiv.

[78] Greg Yang,et al. Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers , 2019, NeurIPS.

[79] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.

[80] Aditi Raghunathan,et al. Certified Defenses against Adversarial Examples , 2018, ICLR.

[81] Donald A. Adjeroh,et al. Unified Deep Supervised Domain Adaptation and Generalization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[82] John Blitzer,et al. Co-Training for Domain Adaptation , 2011, NIPS.

[83] Alexei A. Efros,et al. Unsupervised Domain Adaptation through Self-Supervision , 2019, ArXiv.

[84] Kevin Gimpel,et al. Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise , 2018, NeurIPS.

[85] Tianbao Yang,et al. Learning Attributes Equals Multi-Source Domain Generalization , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[86] Michael I. Jordan,et al. Unsupervised Domain Adaptation with Residual Transfer Networks , 2016, NIPS.

[87] Yongxin Yang,et al. Episodic Training for Domain Generalization , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[88] Benjamin Recht,et al. Do CIFAR-10 Classifiers Generalize to CIFAR-10? , 2018, ArXiv.

[89] Razvan Pascanu,et al. Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.

[90] Kimin Lee,et al. Using Pre-Training Can Improve Model Robustness and Uncertainty , 2019, ICML.

[91] Taesung Park,et al. CyCADA: Cycle-Consistent Adversarial Domain Adaptation , 2017, ICML.