Iterative low-rank approximation based on the redundancy of each network layer

Low-rank approximation is an effective method for deep neural network (DNN) compression. Because different network layers carry different amounts of redundant information, a novel iterative low-rank approximation method based on the redundancy of each network layer is proposed. By compressing the layers with higher redundancy first, the loss of intrinsic information in each layer is expected to be reduced and the performance of the compressed model improved. Experimental results show that the compressed model obtained with this method performs better, at the cost of a slight reduction in compression ratio. It can be concluded that the proposed method better retains the intrinsic information of the pre-trained network.
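The following is a minimal sketch of the idea as described above, not the paper's actual algorithm: it assumes layer redundancy is estimated from the singular value spectrum of a 2-D weight matrix and that each layer is then replaced by a truncated-SVD factorization, processing the most redundant layers first. The function names, the 95% energy threshold, and the SVD-based criterion are illustrative assumptions; the paper may use a different redundancy measure and tensor decompositions (e.g., CP or Tucker) for convolutional layers.

```python
import numpy as np

def redundancy(W, energy=0.95):
    """Estimate redundancy as the fraction of singular directions that can be
    discarded while keeping `energy` of the squared spectral energy."""
    s = np.linalg.svd(W, compute_uv=False)
    cum = np.cumsum(s ** 2) / np.sum(s ** 2)
    rank_needed = int(np.searchsorted(cum, energy)) + 1
    return 1.0 - rank_needed / min(W.shape)  # higher value -> more redundant layer

def truncated_svd(W, rank):
    """Rank-`rank` approximation of W, returned as two factor matrices."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :rank] * s[:rank], Vt[:rank, :]

def iterative_compress(layers, energy=0.95):
    """Compress layers one at a time, most redundant first.
    `layers` maps layer names to 2-D weight matrices."""
    order = sorted(layers, key=lambda n: redundancy(layers[n], energy), reverse=True)
    factors = {}
    for name in order:
        W = layers[name]
        s = np.linalg.svd(W, compute_uv=False)
        cum = np.cumsum(s ** 2) / np.sum(s ** 2)
        rank = int(np.searchsorted(cum, energy)) + 1
        factors[name] = truncated_svd(W, rank)
        # In the actual method, the network would be fine-tuned here before
        # moving on to the next (less redundant) layer.
    return factors

if __name__ == "__main__":
    # Toy usage: two random "layers" of different sizes.
    rng = np.random.default_rng(0)
    layers = {"fc1": rng.standard_normal((256, 128)),
              "fc2": rng.standard_normal((128, 64))}
    for name, (A, B) in iterative_compress(layers).items():
        print(name, A.shape, B.shape)
```

The per-layer fine-tuning step is only indicated by a comment, since the point of the sketch is the ordering of layers by redundancy rather than the retraining schedule.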
