Kervolutional Neural Networks

Convolutional neural networks (CNNs) have enabled the state-of-the-art performance in many computer vision tasks. However, little effort has been devoted to establishing convolution in non-linear space. Existing works mainly leverage on the activation layers, which can only provide point-wise non-linearity. To solve this problem, a new operation, kervolution (kernel convolution), is introduced to approximate complex behaviors of human perception systems leveraging on the kernel trick. It generalizes convolution, enhances the model capacity, and captures higher order interactions of features, via patch-wise kernel functions, but without introducing additional parameters. Extensive experiments show that kervolutional neural networks (KNN) achieve higher accuracy and faster convergence than baseline CNN.

[1]  M. Larkum,et al.  Dendritic action potentials and computation in human layer 2/3 cortical neurons , 2020, Science.

[2]  Chen Wang Kernel learning for visual perception , 2019 .

[3]  Chen Wang,et al.  Ultra-wideband aided fast localization and mapping system , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[4]  Qiang Chen,et al.  Network In Network , 2013, ICLR.

[5]  Subhransu Maji,et al.  Bilinear CNN Models for Fine-Grained Visual Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[6]  Kunle Olukotun,et al.  DAWNBench : An End-to-End Deep Learning Benchmark and Competition , 2017 .

[7]  Yi Li,et al.  Deformable Convolutional Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[8]  Jian Sun,et al.  Identity Mappings in Deep Residual Networks , 2016, ECCV.

[9]  Enhua Wu,et al.  Squeeze-and-Excitation Networks , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[12]  Naonori Ueda,et al.  Polynomial Networks and Factorization Machines: New Insights and Efficient Training Algorithms , 2016, ICML.

[13]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Cordelia Schmid,et al.  Convolutional Kernel Networks , 2014, NIPS.

[15]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[16]  Xiao Liu,et al.  Kernel Pooling for Convolutional Neural Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[18]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[19]  Heinrich Müller,et al.  SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[20]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[21]  Rui Caseiro,et al.  High-Speed Tracking with Kernelized Correlation Filters , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Yoav Goldberg,et al.  splitSVM: Fast, Space-Efficient, non-Heuristic, Polynomial Kernel Computation for NLP Applications , 2008, ACL.

[23]  Chen Wang,et al.  Kernel Cross-Correlator , 2017, AAAI.

[24]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[25]  Zhuowen Tu,et al.  Aggregated Residual Transformations for Deep Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  C.-C. Jay Kuo Understanding convolutional neural networks with a mathematical model , 2016, J. Vis. Commun. Image Represent..

[27]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Jonathon Shlens,et al.  Explaining and Harnessing Adversarial Examples , 2014, ICLR.

[29]  Andrea Vedaldi,et al.  Warped Convolutions: Efficient Invariance to Spatial Transformations , 2016, ICML.

[30]  Andrew Zisserman,et al.  Spatial Transformer Networks , 2015, NIPS.

[31]  Geoffrey E. Hinton,et al.  Dynamic Routing Between Capsules , 2017, NIPS.

[32]  Cristian Sminchisescu,et al.  Efficient Match Kernel between Sets of Features for Visual Recognition , 2009, NIPS.

[33]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[34]  Kilian Q. Weinberger,et al.  Deep Networks with Stochastic Depth , 2016, ECCV.

[35]  Yann LeCun,et al.  Generalization and network design strategies , 1989 .

[36]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[37]  Alexander J. Smola,et al.  Learning with kernels , 1998 .

[38]  Sebastian Nowozin,et al.  On feature combination for multiclass object classification , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[39]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Amnon Shashua,et al.  Deep SimNets , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Max Welling,et al.  Group Equivariant Convolutional Networks , 2016, ICML.

[42]  Lihua Xie,et al.  Non-iterative RGB-D-inertial Odometry , 2017 .

[43]  Geoffrey E. Hinton,et al.  Deep Learning , 2015, Nature.

[44]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[45]  Bernhard E. Boser,et al.  A training algorithm for optimal margin classifiers , 1992, COLT '92.

[46]  Rongrong Ji,et al.  Modulated Convolutional Networks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[47]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[48]  D. Hubel,et al.  Receptive fields and functional architecture of monkey striate cortex , 1968, The Journal of physiology.

[49]  Simon Haykin,et al.  GradientBased Learning Applied to Document Recognition , 2001 .

[50]  Le Song,et al.  Decoupled Networks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[51]  Dieter Fox,et al.  Kernel Descriptors for Visual Recognition , 2010, NIPS.

[52]  Chen Wang,et al.  Correlation Flow: Robust Optical Flow Using Kernel Cross-Correlators , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[53]  Ramesh Raskar,et al.  Pairwise Confusion for Fine-Grained Visual Classification , 2017, ECCV.

[54]  Chen Wang,et al.  Non-iterative SLAM , 2017, 2017 18th International Conference on Advanced Robotics (ICAR).

[55]  Petros Daras,et al.  Non-linear Convolution Filters for CNN-Based Learning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[56]  Lawrence D. Jackel,et al.  Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.

[57]  Geoffrey E. Hinton,et al.  Learning representations by back-propagating errors , 1986, Nature.

[58]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[59]  D. Hubel,et al.  Receptive fields, binocular interaction and functional architecture in the cat's visual cortex , 1962, The Journal of physiology.