Dense xUnit Networks

Deep network architectures have evolved continually over the past few years, leading to significant advancements in a wide array of computer vision tasks. However, besides high accuracy, many applications also require a low computational load and a limited memory footprint. To date, efficiency has typically been achieved either through architectural choices at the macro level (e.g., skip connections or pruning techniques) or through modifications at the level of individual layers (e.g., depth-wise convolutions or channel shuffle operations). Interestingly, much less attention has been devoted to the role of the activation functions in constructing efficient nets. Recently, Kligvasser et al. showed that incorporating spatial connections within the activation functions enables a significant boost in performance in image restoration tasks, at any given budget of parameters. However, the effectiveness of their xUnit module has only been tested on small, simple models, which are not characteristic of those used in high-level vision tasks. In this paper, we adopt and improve the xUnit activation, show how it can be incorporated into the DenseNet architecture, and illustrate its high effectiveness for classification and image restoration tasks alike. While the DenseNet architecture is extremely efficient to begin with, our dense xUnit net (DxNet) can typically achieve the same performance with far fewer parameters. For example, on ImageNet, our DxNet outperforms a ReLU-based DenseNet with 30% more parameters and achieves state-of-the-art results for this budget of parameters. Furthermore, in denoising and super-resolution, DxNet significantly improves upon all existing lightweight solutions, including the xUnit-based nets of Kligvasser et al.
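To make the idea of a spatial activation concrete, the following is a minimal PyTorch sketch of an xUnit-style gate in the spirit of Kligvasser et al. The module name, the default kernel size, and the exact layer ordering here are illustrative assumptions, not necessarily the configuration used in DxNet.

```python
# Minimal sketch of a spatial (xUnit-style) activation, assuming a PyTorch
# setting. The specific choices below (kernel size, BN placement) are
# illustrative, not the exact DxNet configuration.
import torch
import torch.nn as nn

class SpatialGateActivation(nn.Module):
    """Gates each feature map with a spatial weight map produced by a
    depthwise convolution, instead of applying an element-wise ReLU."""
    def __init__(self, channels, kernel_size=9):
        super().__init__()
        self.gate = nn.Sequential(
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            # Depthwise convolution: gives the gate spatial support
            # within each channel, at a modest parameter cost.
            nn.Conv2d(channels, channels, kernel_size,
                      padding=kernel_size // 2, groups=channels),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        # Gaussian-shaped gate in (0, 1]; the output is the input
        # modulated by this learned spatial map.
        g = torch.exp(-self.gate(x) ** 2)
        return x * g
```

Because the gate has spatial support, it can take over part of the modeling burden of the surrounding convolutions, which is what allows the convolutional layers themselves to be made thinner at a fixed level of accuracy.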

[1] Yurong Chen, et al. Dynamic Network Surgery for Efficient DNNs, 2016, NIPS.

[2] Bolei Zhou, et al. Learning Deep Features for Discriminative Localization, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3] Graham W. Taylor, et al. Improved Regularization of Convolutional Neural Networks with Cutout, 2017, ArXiv.

[4] Shuchang Zhou, et al. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients, 2016, ArXiv.

[5] Kilian Q. Weinberger, et al. Densely Connected Convolutional Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.

[7] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[8] Yann LeCun, et al. Optimal Brain Damage, 1989, NIPS.

[9] Yifan Wang, et al. A Fully Progressive Approach to Single-Image Super-Resolution, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[10] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.

[11] Jitendra Malik, et al. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics, 2001, Proceedings Eighth IEEE International Conference on Computer Vision (ICCV 2001).

[12] Sergey Ioffe, et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, 2016, AAAI.

[13] Vijay Vasudevan, et al. Learning Transferable Architectures for Scalable Image Recognition, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[14] Ali Farhadi, et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks, 2016, ECCV.

[15] Andrew Y. Ng, et al. Reading Digits in Natural Images with Unsupervised Feature Learning, 2011.

[16] Jian Yang, et al. MemNet: A Persistent Memory Network for Image Restoration, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[17] Kyoung Mu Lee, et al. Enhanced Deep Residual Networks for Single Image Super-Resolution, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[18] Yoshua Bengio, et al. Deep Sparse Rectifier Neural Networks, 2011, AISTATS.

[19] Jürgen Schmidhuber, et al. Flat Minima, 1997, Neural Computation.

[20] Thomas Brox, et al. U-Net: Convolutional Networks for Biomedical Image Segmentation, 2015, MICCAI.

[21] Tong Tong, et al. Image Super-Resolution Using Dense Skip Connections, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[22] Narendra Ahuja, et al. Fast and Accurate Image Super-Resolution with Deep Laplacian Pyramid Networks, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23] Babak Hassibi, et al. Second Order Derivatives for Network Pruning: Optimal Brain Surgeon, 1992, NIPS.

[24] Ameya Prabhu, et al. Deep Expander Networks: Efficient Deep Networks from Graph Theory, 2017, ECCV.

[25] Lei Zhang, et al. Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising, 2016, IEEE Transactions on Image Processing.

[26] François Chollet, et al. Xception: Deep Learning with Depthwise Separable Convolutions, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27] Zhiqiang Shen, et al. Learning Efficient Convolutional Networks through Network Slimming, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[28] Ran El-Yaniv, et al. Binarized Neural Networks, 2016, ArXiv.

[29] Eirikur Agustsson, et al. NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[30] Kyung-Ah Sohn, et al. Fast, Accurate, and Lightweight Super-Resolution with Cascading Residual Network, 2018, ECCV.

[31] Narendra Ahuja, et al. Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32] Kyoung Mu Lee, et al. Accurate Image Super-Resolution Using Very Deep Convolutional Networks, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33] Forrest N. Iandola, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size, 2016, ArXiv.

[34] Gang Sun, et al. Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks, 2018, NeurIPS.

[35] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.

[36] Song Han, et al. Trained Ternary Quantization, 2016, ICLR.

[37] Geoffrey E. Hinton, et al. On the importance of initialization and momentum in deep learning, 2013, ICML.

[38] Christian Ledig, et al. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39] Ali Farhadi, et al. YOLO9000: Better, Faster, Stronger, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40] Wangmeng Zuo, et al. Learning Deep CNN Denoiser Prior for Image Restoration, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41] Bo Chen, et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, 2017, ArXiv.

[42] Tomer Michaeli, et al. xUnit: Learning a Spatial Activation Function for Efficient Image Restoration, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[43] Lei Zhang, et al. FFDNet: Toward a Fast and Flexible Solution for CNN-Based Image Denoising, 2017, IEEE Transactions on Image Processing.

[44] Ran El-Yaniv, et al. Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations, 2016, Journal of Machine Learning Research.

[45] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.

[46] Xiangyu Zhang, et al. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[47] Victor S. Lempitsky, et al. Fast ConvNets Using Group-Wise Brain Damage, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[48] Mark Sandler, et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[49] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[50] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.

[51] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.