Non-linear Convolution Filters for CNN-Based Learning

Over the last several years, Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance in image classification. Their architectures have largely drawn inspiration from models of the primate visual system. However, while recent neuroscience results demonstrate the existence of non-linear operations in the responses of complex visual cells, little effort has been devoted to extending the convolution technique to non-linear forms. Typical convolutional layers are linear systems, and their expressiveness is therefore limited. To overcome this, various non-linearities have been used as activation functions inside CNNs, and many pooling strategies have been applied as well. We address the problem of developing a convolution method in the context of a computational model of the visual cortex, exploring quadratic forms through Volterra kernels. Such forms, which constitute a richer function space, serve as approximations of the response profiles of visual cells. Our proposed second-order convolution is evaluated on CIFAR-10 and CIFAR-100. We show that a network combining linear and non-linear filters in its convolutional layers can outperform networks that use standard linear filters with the same architecture, yielding results competitive with the state of the art on these datasets.
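The second-order convolution described above replaces the usual inner product of a filter with an image patch by a truncated Volterra expansion: for a flattened patch x, the response is y = b + w1ᵀx + xᵀW2x, where w1 is the first-order (linear) kernel and W2 the second-order (quadratic) kernel. The following is a minimal NumPy sketch of this idea, not the authors' implementation; the function names, the dense W2, and the valid-mode sliding window are our own illustrative choices:

```python
import numpy as np

def volterra_response(patch, w1, W2, bias=0.0):
    """Second-order Volterra response for one patch:
    y = b + w1^T x + x^T W2 x, with x the flattened patch."""
    x = patch.ravel()
    return bias + w1 @ x + x @ W2 @ x

def volterra_conv2d(image, w1, W2, k):
    """Slide a k x k quadratic Volterra filter over a 2-D image
    ('valid' mode, stride 1). w1 has length k*k; W2 is (k*k, k*k)."""
    H, W = image.shape
    out = np.empty((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = volterra_response(image[i:i + k, j:j + k], w1, W2)
    return out
```

With W2 = 0 this reduces exactly to a standard linear convolution (up to kernel flipping), which is why such layers can mix linear and quadratic filters within one architecture; in practice W2 would be learned, and is often constrained (e.g. low-rank or symmetric) to keep the parameter count manageable.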
