Equivariant Contrastive Learning

In state-of-the-art self-supervised learning (SSL), pre-training produces semantically good representations by encouraging them to be invariant under meaningful transformations prescribed by human knowledge. In fact, invariance is a trivial instance of a broader class of properties called equivariance, which can be intuitively understood as the property that representations transform according to the way the inputs transform. Here, we show that rather than using invariance alone, pre-training that encourages non-trivial equivariance to some transformations, while maintaining invariance to other transformations, can be used to improve the semantic quality of representations. Specifically, we extend popular SSL methods to a more general framework which we name Equivariant Self-Supervised Learning (E-SSL). In E-SSL, a simple additional pre-training objective encourages equivariance by predicting the transformations applied to the input. We demonstrate E-SSL's effectiveness empirically on several popular computer vision benchmarks. Furthermore, we demonstrate the usefulness of E-SSL for applications beyond computer vision; in particular, we show its utility on regression problems in photonics science. We will release our code.
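To make the mechanism concrete, below is a minimal PyTorch sketch of what an E-SSL-style objective could look like: a standard contrastive (invariance) loss on two augmented views, plus an auxiliary head that predicts which four-fold rotation was applied to the input (the equivariance term). The model class, the `info_nce` helper, the toy encoder, and the loss weight are all illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ESSLModel(nn.Module):
    """Illustrative E-SSL-style model: a shared encoder, a projector for the
    invariance (contrastive) branch, and a head that predicts which of four
    rotations (0/90/180/270 degrees) was applied (equivariance branch)."""

    def __init__(self, feat_dim=512, proj_dim=128, n_rotations=4):
        super().__init__()
        # Toy encoder for a self-contained example; in practice this would
        # be a full backbone such as a ResNet.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU(),
        )
        self.projector = nn.Linear(feat_dim, proj_dim)    # invariance branch
        self.rot_head = nn.Linear(feat_dim, n_rotations)  # equivariance branch

    def forward(self, x):
        h = self.encoder(x)
        return self.projector(h), self.rot_head(h)

def info_nce(z1, z2, temperature=0.5):
    """SimCLR-style contrastive loss: each sample's positive is its other view."""
    n = z1.shape[0]
    z = F.normalize(torch.cat([z1, z2]), dim=1)
    sim = z @ z.t() / temperature
    # Mask out self-similarity so a sample cannot match itself.
    sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool), float('-inf'))
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)

model = ESSLModel()
x1 = torch.randn(8, 3, 32, 32)  # first augmented view of a batch
x2 = torch.randn(8, 3, 32, 32)  # second augmented view of the same batch

# Invariance term: projections of the two views should agree.
z1, _ = model(x1)
z2, _ = model(x2)
loss_inv = info_nce(z1, z2)

# Equivariance term: rotate each image and predict the applied rotation.
rot_labels = torch.randint(0, 4, (8,))
x_rot = torch.stack([torch.rot90(img, k.item(), dims=(1, 2))
                     for img, k in zip(x1, rot_labels)])
_, rot_logits = model(x_rot)
loss_equi = F.cross_entropy(rot_logits, rot_labels)

loss = loss_inv + 0.4 * loss_equi  # weight is an illustrative hyperparameter
loss.backward()
```

The key design point this sketch illustrates is that both branches share one encoder: the rotation-prediction gradient shapes the same representation that the contrastive branch uses, so the features are pushed to be sensitive (equivariant) to rotations while remaining invariant to the other augmentations.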
