Causal Interventional Training for Image Recognition

Deep learning models often fit undesired dataset bias during training. In this paper, we formulate this bias using causal inference, which helps us uncover the elusive causal relations among the key factors in training and thus pursue the desired causal effect without the bias. We start by revisiting the process of building a visual recognition system, and then propose a structural causal model (SCM) for the key variables involved in dataset collection and the recognition model: object, common sense, bias, context, and label prediction. Based on the SCM, one can observe that there are “good” and “bad” biases. Intuitively, in an image of a car driving on a highway in a desert, the “good” bias, which denotes the common-sense context, is the highway, and the “bad” bias, which accounts for the noisy context factor, is the desert. We tackle this problem with a novel causal interventional training (CIT) approach, in which we control the observed context in each object class. We offer theoretical justifications for CIT and validate it with extensive classification experiments on CIFAR-10, CIFAR-100, and ImageNet; for example, CIT surpasses the standard deep neural networks ResNet-34 and ResNet-50 by 0.95% and 0.70% in accuracy, respectively, on ImageNet. Our code is open-sourced on GitHub at https://github.com/qinwei-hfut/CIT.
