Out-of-Distribution Generalization via Risk Extrapolation (REx)

Generalizing beyond the training distribution is an open challenge for current machine learning systems. A weak form of out-of-distribution (OoD) generalization is the ability to interpolate successfully between multiple observed distributions. One way to achieve this is robust optimization, which seeks to minimize the worst-case risk over convex combinations of the training distributions. A much stronger form of OoD generalization, however, is the ability to extrapolate beyond the distributions observed during training. In pursuit of strong OoD generalization, we introduce the principle of Risk Extrapolation (REx). REx can be viewed as encouraging robustness over affine combinations of training risks by encouraging strict equality between those risks. We show conceptually how this principle enables extrapolation, and demonstrate the effectiveness and scalability of instantiations of REx on various OoD generalization tasks. Our code is publicly available.
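
The contrast between convex and affine combinations of training risks can be made concrete with a small numerical sketch. The abstract does not spell out any particular instantiation, so the two functions below are assumptions for illustration: one penalizes the variance of per-environment risks, and the other computes the worst-case risk over affine combinations whose coefficients sum to one and are each bounded below by `lambda_min`. The names `vrex_objective`, `worst_case_affine_risk`, `beta`, and `lambda_min` are hypothetical, and the choice of population variance is likewise an assumption.

```python
from statistics import pvariance


def vrex_objective(env_risks, beta):
    """Sum of per-environment risks plus beta times their variance.

    Assumption: population variance is used as the penalty. A large beta
    pushes the risks toward equality across training environments.
    """
    return sum(env_risks) + beta * pvariance(env_risks)


def worst_case_affine_risk(env_risks, lambda_min):
    """Worst-case risk over affine combinations of the training risks.

    Coefficients must sum to 1 and each must be >= lambda_min.
    lambda_min = 0 gives convex combinations (plain robust optimization);
    lambda_min < 0 allows extrapolation beyond the convex hull.
    The maximum is attained by placing all remaining weight on the
    largest risk, which yields the closed form below.
    """
    m = len(env_risks)
    return (1 - m * lambda_min) * max(env_risks) + lambda_min * sum(env_risks)


if __name__ == "__main__":
    unequal = [0.2, 0.6]  # hypothetical per-environment risks
    equal = [0.4, 0.4]

    print(vrex_objective(unequal, beta=10.0))
    print(vrex_objective(equal, beta=10.0))

    # Convex combinations only: the worst case is the largest training risk.
    print(worst_case_affine_risk(unequal, lambda_min=0.0))
    # Allowing negative coefficients raises the worst case for unequal risks...
    print(worst_case_affine_risk(unequal, lambda_min=-1.0))
    # ...but leaves it unchanged when the risks are equal.
    print(worst_case_affine_risk(equal, lambda_min=-1.0))
```

In this sketch, equal training risks make the worst-case affine risk independent of how far the coefficients are allowed to extrapolate, which is one way to read the abstract's claim that enforcing equality between training risks encourages robustness over affine combinations.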
