Neural Network Compression via Additive Combination of Reshaped, Low-Rank Matrices

In the last five years, neural network compression has become an important problem due to the increasing need to run complex networks on small devices. We consider a form of network compression that has not been explored before: an additive combination of reshaped low-rank matrices. That is, given the weights of a neural network, we constrain them to be a sum of differently shaped low-rank matrices in order to reduce the network's size and inference cost. Computationally, this is a hard problem involving integer variables (the ranks) and continuous variables (the weights), as well as a nonlinear loss and nonlinear constraints. We formulate it as model selection over the family of compressed models and give an optimization algorithm that efficiently handles the inherent combinatorial structure. The result is a “Learning-Compression” algorithm that alternates between a standard machine learning step and a step involving signal compression. We demonstrate the effectiveness of the proposed compression scheme and the corresponding algorithm on multiple networks and datasets.
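To make the compression scheme concrete, the sketch below fits a layer's flattened weight vector with a sum of differently shaped low-rank matrices via block coordinate descent on the squared error. The shapes, ranks, function names, and fitting scheme here are illustrative assumptions, not the paper's exact formulation or algorithm.

```python
# Minimal sketch, assuming:
# - the layer's weights are flattened into a vector `w`,
# - additive term i reshapes `w` into a matrix of shape `shapes[i]`
#   (with prod(shapes[i]) == w.size) and is constrained to rank `ranks[i]`,
# - the terms are fit by block coordinate descent on the squared error.
import numpy as np

def truncated_svd(M, r):
    """Best rank-r approximation of M in the Frobenius norm (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def additive_reshaped_lowrank(w, shapes, ranks, n_iters=20):
    """Approximate the flat vector w as a sum of terms, where term i is a
    rank-ranks[i] matrix of shape shapes[i], flattened back to a vector."""
    terms = [np.zeros_like(w) for _ in shapes]
    for _ in range(n_iters):
        for i, (shape, r) in enumerate(zip(shapes, ranks)):
            # Fit term i to the residual left by all the other terms.
            residual = w - sum(t for j, t in enumerate(terms) if j != i)
            terms[i] = truncated_svd(residual.reshape(shape), r).ravel()
    return sum(terms)

# Example: 4096 weights approximated by a rank-2 term reshaped to 64x64
# plus a rank-1 term reshaped to 16x256 (same flattened size).
w = np.random.randn(4096)
w_hat = additive_reshaped_lowrank(w, shapes=[(64, 64), (16, 256)], ranks=[2, 1])
print(np.linalg.norm(w - w_hat) / np.linalg.norm(w))  # relative fit error
```

Each inner update reduces to a reshape followed by a truncated SVD of the current residual, which is why differently shaped terms can capture structure that a single low-rank factorization of one fixed shape would miss.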
