Paper information - Editing a classifier by rewriting its prediction rules

Editing a classifier by rewriting its prediction rules

We present a methodology for modifying the behavior of a classifier by directly rewriting its prediction rules. Our approach requires virtually no additional data collection and can be applied to a variety of settings, including adapting a model to new environments and modifying it to ignore spurious features.
