Dynamic Group Convolution for Accelerating Convolutional Neural Networks

Replacing normal convolutions with group convolutions can significantly increase the computational efficiency of modern deep convolutional networks, which has been widely adopted in compact network architecture designs. However, existing group convolutions undermine the original network structures by cutting off some connections permanently resulting in significant accuracy degradation. In this paper, we propose dynamic group convolution (DGC) that adaptively selects which part of input channels to be connected within each group for individual samples on the fly. Specifically, we equip each group with a small feature selector to automatically select the most important input channels conditioned on the input images. Multiple groups can adaptively capture abundant and complementary visual/semantic features for each input image. The DGC preserves the original network structure and has similar computational efficiency as the conventional group convolution simultaneously. Extensive experiments on multiple image classification benchmarks including CIFAR-10, CIFAR-100 and ImageNet demonstrate its superiority over the existing group convolution techniques and dynamic execution methods. The code is available at this https URL.

[1]  Wencong Xiao,et al.  SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity Through Low-Bit Quantization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Yi Yang,et al.  Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks , 2018, IJCAI.

[3]  Yi Yang,et al.  More is Less: A More Complicated Network with Less Inference Complexity , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Jinjun Xiong,et al.  On the Universal Approximability and Complexity Bounds of Quantized ReLU Neural Networks , 2018, ICLR.

[5]  Ping Liu,et al.  Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Serge J. Belongie,et al.  Convolutional Networks with Adaptive Inference Graphs , 2017, International Journal of Computer Vision.

[7]  Enhua Wu,et al.  Squeeze-and-Excitation Networks , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Cheng-Zhong Xu,et al.  Dynamic Channel Pruning: Feature Boosting and Suppression , 2018, ICLR.

[9]  Lei Zhang,et al.  Variational Bayesian Dropout With a Hierarchical Prior , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Jia Deng,et al.  Dynamic Deep Neural Networks: Optimizing Accuracy-Efficiency Trade-offs by Selective Execution , 2017, AAAI.

[11]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Rongrong Ji,et al.  Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Sanguthevar Rajasekaran,et al.  AutoPrune: Automatic Network Pruning by Regularizing Auxiliary Parameters , 2019, NeurIPS.

[14]  Xiaodan Liang Learning Personalized Modular Network Guided by Structured Knowledge , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Kilian Q. Weinberger,et al.  CondenseNet: An Efficient DenseNet Using Learned Group Convolutions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[16]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[17]  U Kang,et al.  Knowledge Extraction with No Observable Data , 2019, NeurIPS.

[18]  Zhiru Zhang,et al.  Channel Gating Neural Networks , 2018, NeurIPS.

[19]  Christoph H. Lampert,et al.  Distillation-Based Training for Multi-Exit Architectures , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[20]  Liujuan Cao,et al.  Towards Optimal Structured CNN Pruning via Generative Adversarial Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Jing Liu,et al.  Discrimination-aware Channel Pruning for Deep Neural Networks , 2018, NeurIPS.

[22]  Jiaxiang Wu,et al.  Collaborative Channel Pruning for Deep Networks , 2019, ICML.

[23]  Jingdong Wang,et al.  Interleaved Group Convolutions , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[24]  Xiangyu Zhang,et al.  Channel Pruning for Accelerating Very Deep Neural Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[25]  Xiangyu Zhang,et al.  ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[26]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[27]  Philip H. S. Torr,et al.  SNIP: Single-shot Network Pruning based on Connection Sensitivity , 2018, ICLR.

[28]  Mark Sandler,et al.  MobileNetV2: Inverted Residuals and Linear Bottlenecks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[29]  Shiguang Shan,et al.  Fully Learnable Group Convolution for Acceleration of Deep Neural Networks , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Kilian Q. Weinberger,et al.  Multi-Scale Dense Networks for Resource Efficient Image Classification , 2017, ICLR.

[31]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[32]  Vijay Vasudevan,et al.  Learning Transferable Architectures for Scalable Image Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[33]  Zhiqiang Shen,et al.  Learning Efficient Convolutional Networks through Network Slimming , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[34]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Dong Liu,et al.  IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks , 2018, BMVC.

[36]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Christoph H. Lampert,et al.  Towards Understanding Knowledge Distillation , 2019, ICML.

[38]  Jiwen Lu,et al.  Runtime Neural Pruning , 2017, NIPS.

[39]  Sangdoo Yun,et al.  A Comprehensive Overhaul of Feature Distillation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[40]  Bo Chen,et al.  MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.

[41]  Natalia Gimelshein,et al.  PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.

[42]  Masami Kato,et al.  Knowledge Representing: Efficient, Sparse Representation of Prior Knowledge for Knowledge Distillation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[43]  Zhuowen Tu,et al.  Aggregated Residual Transformations for Deep Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[44]  Xin Wang,et al.  SkipNet: Learning Dynamic Routing in Convolutional Networks , 2017, ECCV.

[45]  Jian Sun,et al.  Efficient and accurate approximations of nonlinear convolutional networks , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[46]  Larry S. Davis,et al.  BlockDrop: Dynamic Inference Paths in Residual Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[47]  Andrew Zisserman,et al.  Speeding up Convolutional Neural Networks with Low Rank Expansions , 2014, BMVC.

[48]  Xiaolin Hu,et al.  Knowledge Distillation via Route Constrained Optimization , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[49]  François Chollet,et al.  Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[50]  Song Han,et al.  Learning both Weights and Connections for Efficient Neural Network , 2015, NIPS.

[51]  Hanan Samet,et al.  Pruning Filters for Efficient ConvNets , 2016, ICLR.

[52]  Rongrong Ji,et al.  Circulant Binary Convolutional Networks: Enhancing the Performance of 1-Bit DCNNs With Circulant Back Propagation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[53]  Joan Bruna,et al.  Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation , 2014, NIPS.

[54]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[55]  Xiangyu Zhang,et al.  ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design , 2018, ECCV.

[56]  Charbel Sakr,et al.  Per-Tensor Fixed-Point Quantization of the Back-Propagation Algorithm , 2018, ICLR.

[57]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[58]  Venkatesh Saligrama,et al.  Adaptive Neural Networks for Efficient Inference , 2017, ICML.

[59]  Andreas E. Savakis,et al.  Cascaded Projection: End-To-End Network Compression and Acceleration , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[60]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[61]  Daniel Soudry,et al.  Post training 4-bit quantization of convolutional networks for rapid-deployment , 2018, NeurIPS.