Image classification in frequency domain with 2SReLU: a second harmonics superposition activation function

Deep Convolutional Neural Networks can identify complex patterns and perform tasks with super-human capability. Despite these exceptional results, however, they are not fully understood, and it remains impractical to hand-engineer comparable solutions. In this work, an image classification Convolutional Neural Network and its building blocks are described from a frequency domain perspective. Some network layers, such as the convolutional and pooling layers, have established counterparts in the frequency domain. We propose the 2SReLU layer, a novel non-linear activation function that preserves high-frequency components in deep networks. We demonstrate that, in the frequency domain, it is possible to achieve competitive results without the computationally costly convolution operation. A source code implementation in PyTorch is provided at: this https URL
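The frequency-domain view rests on the convolution theorem: circular convolution in the spatial domain is pointwise multiplication of DFT spectra, which is why the costly convolution operation can be avoided. The sketch below illustrates this identity with NumPy and adds a minimal, hypothetical sketch of a second-harmonic superposition activation in the spirit of 2SReLU; the coefficients `a1`, `a2` and the exact indexing are illustrative assumptions, not the paper's implementation (which is in PyTorch at the linked URL).

```python
import numpy as np

def circular_conv2d(x, k):
    """Direct circular 2-D convolution (brute force), for reference."""
    H, W = x.shape
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            s = 0.0
            for u in range(H):
                for v in range(W):
                    s += x[u, v] * k[(i - u) % H, (j - v) % W]
            out[i, j] = s
    return out

def two_s_relu(F, a1=0.5, a2=0.25):
    """Hypothetical second-harmonic superposition activation.

    Each frequency component F[u, v] is superposed with a scaled copy of
    its second harmonic F[2u, 2v] (where that index exists). The weights
    a1, a2 are illustrative placeholders, not the published values.
    """
    H, W = F.shape
    second = np.zeros_like(F)
    second[: H // 2, : W // 2] = F[::2, ::2][: H // 2, : W // 2]
    return a1 * F + a2 * second

rng = np.random.default_rng(0)
x = rng.random((8, 8))
k = rng.random((8, 8))

# Convolution theorem: IFFT(FFT(x) * FFT(k)) equals circular convolution.
freq_result = np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(k)).real
spatial_result = circular_conv2d(x, k)
print(np.allclose(freq_result, spatial_result))  # True

# Applying the activation directly on a spectrum, no spatial round-trip.
activated = two_s_relu(np.fft.fft2(x))
```

Because the activation operates on spectra directly, a network built this way can stay in the frequency domain across layers instead of switching back and forth with inverse transforms.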
