Non-linear Convolution Filters for CNN-Based Learning

Over the last several years, Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance in image classification. Their architectures have largely drawn inspiration from models of the primate visual system. However, while recent neuroscience results demonstrate the existence of non-linear operations in the responses of complex visual cells, little effort has been devoted to extending the convolution technique to non-linear forms. Typical convolutional layers are linear systems, and their expressiveness is therefore limited. To overcome this, various non-linearities have been used as activation functions inside CNNs, and many pooling strategies have been applied as well. We address the problem of developing a convolution method in the context of a computational model of the visual cortex, exploring quadratic forms through Volterra kernels. Such forms, which constitute a richer function space, serve as approximations of the response profiles of visual cells. Our proposed second-order convolution is evaluated on CIFAR-10 and CIFAR-100. We show that a network combining linear and non-linear filters in its convolutional layers can outperform networks that use standard linear filters with the same architecture, yielding results competitive with the state of the art on these datasets.
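The second-order convolution described above replaces the usual inner product of a filter with an image patch by a truncated Volterra expansion: for a flattened patch x, the response is y = b + w1ᵀx + xᵀW2x, where w1 is the first-order (linear) kernel and W2 the second-order (quadratic) kernel. The following is a minimal NumPy sketch of this idea, not the authors' implementation; the function names, the dense W2, and the valid-mode sliding window are our own illustrative choices:

```python
import numpy as np

def volterra_response(patch, w1, W2, bias=0.0):
    """Second-order Volterra response for one patch:
    y = b + w1^T x + x^T W2 x, with x the flattened patch."""
    x = patch.ravel()
    return bias + w1 @ x + x @ W2 @ x

def volterra_conv2d(image, w1, W2, k):
    """Slide a k x k quadratic Volterra filter over a 2-D image
    ('valid' mode, stride 1). w1 has length k*k; W2 is (k*k, k*k)."""
    H, W = image.shape
    out = np.empty((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = volterra_response(image[i:i + k, j:j + k], w1, W2)
    return out
```

With W2 = 0 this reduces exactly to a standard linear convolution (up to kernel flipping), which is why such layers can mix linear and quadratic filters within one architecture; in practice W2 would be learned, and is often constrained (e.g. low-rank or symmetric) to keep the parameter count manageable.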
