A concatenating framework of shortcut convolutional neural networks

It is well accepted that convolutional neural networks play an important role in learning excellent features for image classification and recognition. However, in tradition they only allow adjacent layers connected, limiting integration of multi-scale information. To further improve their performance, we present a concatenating framework of shortcut convolutional neural networks. This framework can concatenate multi-scale features by shortcut connections to the fully-connected layer that is directly fed to the output layer. We do a large number of experiments to investigate performance of the shortcut convolutional neural networks on many benchmark visual datasets for different tasks. The datasets include AR, FERET, FaceScrub, CelebA for gender classification, CUReT for texture classification, MNIST for digit recognition, and CIFAR-10 for object recognition. Experimental results show that the shortcut convolutional neural networks can achieve better results than the traditional ones on these tasks, with more stability in different settings of pooling schemes, activation functions, optimizations, initializations, kernel numbers and kernel sizes.

[1]  Jian Sun,et al.  Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[2]  Haibo He,et al.  Goal Representation Heuristic Dynamic Programming on Maze Navigation , 2013, IEEE Transactions on Neural Networks and Learning Systems.

[3]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[4]  Wei Shen,et al.  Multi-crop Convolutional Neural Networks for lung nodule malignancy suspiciousness classification , 2017, Pattern Recognit..

[5]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  Jürgen Schmidhuber,et al.  Highway Networks , 2015, ArXiv.

[7]  Shuo Yang,et al.  From Facial Parts Responses to Face Detection: A Deep Learning Approach , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[8]  Stefan Winkler,et al.  A data-driven approach to cleaning large face datasets , 2014, 2014 IEEE International Conference on Image Processing (ICIP).

[9]  Trevor Darrell,et al.  Do Convnets Learn Correspondence? , 2014, NIPS.

[10]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Lawrence D. Jackel,et al.  Handwritten Digit Recognition with a Back-Propagation Network , 1989, NIPS.

[12]  Xiaogang Wang,et al.  Deep Learning Face Representation from Predicting 10,000 Classes , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Paul F. Whelan,et al.  Using filter banks in Convolutional Neural Networks for texture classification , 2016, Pattern Recognit. Lett..

[15]  Ming Yang,et al.  DeepFace: Closing the Gap to Human-Level Performance in Face Verification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Shree K. Nayar,et al.  Reflectance and texture of real-world surfaces , 1999, TOGS.

[17]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[18]  Yann LeCun,et al.  Pedestrian Detection with Unsupervised Multi-stage Feature Learning , 2012, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Xiaogang Wang,et al.  Hybrid Deep Learning for Face Verification , 2013, 2013 IEEE International Conference on Computer Vision.

[20]  Haizhou Li,et al.  Rapid Feedforward Computation by Temporal Encoding and Learning With Spiking Neurons , 2013, IEEE Transactions on Neural Networks and Learning Systems.

[21]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[22]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[23]  Edilson de Aguiar,et al.  Facial expression recognition with Convolutional Neural Networks: Coping with few data and the training sample order , 2017, Pattern Recognit..

[24]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Yann LeCun,et al.  Traffic sign recognition with multi-scale Convolutional Networks , 2011, The 2011 International Joint Conference on Neural Networks.

[26]  Harry Wechsler,et al.  The FERET database and evaluation procedure for face-recognition algorithms , 1998, Image Vis. Comput..

[27]  Avinash C. Kak,et al.  PCA versus LDA , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[28]  Shuicheng Yan,et al.  Training Skinny Deep Neural Networks with Iterative Hard Thresholding Methods , 2016, ArXiv.

[29]  Kunihiko Fukushima,et al.  Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position , 1980, Biological Cybernetics.

[30]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[31]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[32]  Jian Sun,et al.  Accelerating Very Deep Convolutional Networks for Classification and Detection , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[33]  Yoshua Bengio,et al.  Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.

[34]  Xiaogang Wang,et al.  Deep Convolutional Network Cascade for Facial Point Detection , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[35]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).