Mixed Link Networks

Basing on the analysis by revealing the equivalence of modern networks, we find that both ResNet and DenseNet are essentially derived from the same "dense topology", yet they only differ in the form of connection -- addition (dubbed "inner link") vs. concatenation (dubbed "outer link"). However, both two forms of connections have the superiority and insufficiency. To combine their advantages and avoid certain limitations on representation learning, we present a highly efficient and modularized Mixed Link Network (MixNet) which is equipped with flexible inner link and outer link modules. Consequently, ResNet, DenseNet and Dual Path Network (DPN) can be regarded as a special case of MixNet, respectively. Furthermore, we demonstrate that MixNets can achieve superior efficiency in parameter over the state-of-the-art architectures on many competitive datasets like CIFAR-10/100, SVHN and ImageNet.

[1]  Nikos Komodakis,et al.  Wide Residual Networks , 2016, BMVC.

[2]  Thomas Brox,et al.  Striving for Simplicity: The All Convolutional Net , 2014, ICLR.

[3]  Zhuowen Tu,et al.  On the Connection of Deep Fusion to Ensembling , 2016, ArXiv.

[4]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[5]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[6]  Yann LeCun,et al.  Convolutional neural networks applied to house numbers digit classification , 2012, Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012).

[7]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[8]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Andrew Y. Ng,et al.  Reading Digits in Natural Images with Unsupervised Feature Learning , 2011 .

[10]  Shuicheng Yan,et al.  Dual Path Networks , 2017, NIPS.

[11]  Jian Sun,et al.  Identity Mappings in Deep Residual Networks , 2016, ECCV.

[12]  Qiang Chen,et al.  Network In Network , 2013, ICLR.

[13]  Zhuowen Tu,et al.  Aggregated Residual Transformations for Deep Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Gregory Shakhnarovich,et al.  FractalNet: Ultra-Deep Neural Networks without Residuals , 2016, ICLR.

[15]  Sergey Ioffe,et al.  Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Jürgen Schmidhuber,et al.  Training Very Deep Networks , 2015, NIPS.

[17]  Jian Sun,et al.  Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[18]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[19]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[20]  Jingdong Wang,et al.  Interleaved Group Convolutions , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[21]  Kilian Q. Weinberger,et al.  Deep Networks with Stochastic Depth , 2016, ECCV.

[22]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Yueting Zhuang,et al.  Deep Convolutional Neural Networks with Merge-and-Run Mappings , 2016, IJCAI.

[24]  Xiaolin Hu,et al.  Recurrent convolutional neural network for object recognition , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Wenjun Zeng,et al.  Deeply-Fused Nets , 2016, ArXiv.

[26]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[27]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Geoffrey E. Hinton,et al.  On the importance of initialization and momentum in deep learning , 2013, ICML.

[29]  Yoshua Bengio,et al.  Maxout Networks , 2013, ICML.

[30]  Yoshua Bengio,et al.  FitNets: Hints for Thin Deep Nets , 2014, ICLR.