The Analysis Between Traditional Convolution Neural Network and CapsuleNet

Convolutional neural networks (CNNs) have achieved a series of breakthroughs in fields such as autonomous driving, robotics, and medical imaging. The powerful feature-learning and classification abilities of CNNs have attracted wide attention from researchers. With improvements in network structure and increases in network depth, performance on tasks such as image classification, detection, recognition, and segmentation has improved greatly. Recently, CapsuleNet, an architecture that differs fundamentally from CNNs, was proposed. In CapsuleNet, vectors replace traditional scalar activations, which better characterizes the relationships between features and captures more attributes of objects. In this paper, the basic structure and algorithmic principles of classical CNN models are described first; then the basic neural network, network structure, parameter updates, and routing process of the CapsuleNet model are analyzed and summarized. Finally, several groups of training experiments are conducted to compare and analyze the image classification and feature-expression capabilities of CapsuleNet and CNN models.
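The key idea that vectors replace scalar activations can be illustrated with the "squash" non-linearity used by capsule networks, which rescales a capsule's vector so that its length lies in [0, 1) and can be read as a probability while its direction encodes the entity's attributes. Below is a minimal numpy sketch of that function; the function name and the small epsilon for numerical stability are illustrative choices, not taken from the paper.

```python
import numpy as np

def squash(s, eps=1e-8):
    """Capsule squash non-linearity: shrinks short vectors toward zero
    and long vectors toward unit length, preserving direction.
    v = (|s|^2 / (1 + |s|^2)) * (s / |s|)"""
    sq_norm = np.sum(s ** 2, axis=-1, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * s / np.sqrt(sq_norm + eps)

# A capsule output of length 5 keeps its direction but is squashed
# to length 25/26, i.e. just under 1.
v = squash(np.array([3.0, 4.0]))
```

In contrast, a conventional CNN neuron would reduce the same input to a single scalar (e.g. via ReLU on a weighted sum), discarding the directional information that the capsule vector retains.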
