OneDConv: Generalized Convolution For Transform-Invariant Representation

Convolutional Neural Networks (CNNs) have exhibited their great power in a variety of vision tasks. However, the lack of transform-invariant property limits their further applications in complicated real-world scenarios. In this work, we proposed a novel generalized one dimension convolutional operator (OneDConv), which dynamically transforms the convolution kernels based on the input features in a computationally and parametrically efficient manner. The proposed operator can extract the transform-invariant features naturally. It improves the robustness and generalization of convolution without sacrificing the performance on common images. The proposed OneDConv operator can substitute the vanilla convolution, thus it can be incorporated into current popular convolutional architectures and trained end-to-end readily. On several popular benchmarks, OneDConv outperforms the original convolution operation and other proposed models both in canonical and distorted images.

[1]  Yi Li,et al.  Deformable Convolutional Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[2]  Yong Liu,et al.  Feature Lenses: Plug-and-play Neural Modules for Transformation-Invariant Visual Representations , 2020, ArXiv.

[3]  Stephan J. Garbin,et al.  Harmonic Networks: Deep Translation and Rotation Equivariance , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  M. Varacallo,et al.  2019 , 2019, Journal of Surgical Orthopaedic Advances.

[5]  Steven Henikoff,et al.  SIFT: predicting amino acid changes that affect protein function , 2003, Nucleic Acids Res..

[6]  Geoffrey E. Hinton,et al.  Dynamic Routing Between Capsules , 2017, NIPS.

[7]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Matti Pietikäinen,et al.  Face Description with Local Binary Patterns: Application to Face Recognition , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Proceedings of the IEEE , 2018, IEEE Journal of Emerging and Selected Topics in Power Electronics.

[10]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[11]  Timo Aila,et al.  Pruning Convolutional Neural Networks for Resource Efficient Inference , 2016, ICLR.

[12]  Dacheng Tao,et al.  Transform-Invariant Convolutional Neural Networks for Image Classification and Search , 2016, ACM Multimedia.

[13]  Appendix to “ Bayesian Mixtures of Autoregressive Models ” published in the Journal of Computational and Graphical Statistics , 2010 .

[14]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[15]  Guanghui Wang,et al.  Towards Learning Affine-Invariant Representations via Data-Efficient CNNs , 2019, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).

[16]  Geoffrey E. Hinton,et al.  Transforming Auto-Encoders , 2011, ICANN.

[17]  Abhishek Das,et al.  Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[18]  Junmo Kim,et al.  Deep Pyramidal Residual Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Klara Nahrstedt,et al.  Proceedings of the 24th ACM international conference on Multimedia , 2006, MM 2006.

[20]  D. Gabor,et al.  Theory of communication. Part 1: The analysis of information , 1946 .

[21]  Xiao-Li Meng,et al.  The Art of Data Augmentation , 2001 .

[22]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[23]  Max Welling,et al.  Group Equivariant Convolutional Networks , 2016, ICML.

[24]  Joachim M. Buhmann,et al.  TI-POOLING: Transformation-Invariant Pooling for Feature Learning in Convolutional Neural Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Chen Chen,et al.  Gabor Convolutional Networks , 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).