BasisConv: A method for compressed representation and learning in CNNs

It is well known that Convolutional Neural Networks (CNNs) have significant redundancy in their filter weights. Various methods have been proposed in the literature to compress trained CNNs, including pruning weights, quantizing filters, and representing filters in terms of basis functions. Our approach falls in the latter class of strategies, but is distinct in that we show both compressed learning and compressed representation can be achieved without significant modification of popular CNN architectures. Specifically, any convolution layer of the CNN is replaced by two successive convolution layers: the first is a set of fixed filters that represent the knowledge space of the entire layer and do not change; it is followed by a layer of one-dimensional filters that represent the learned knowledge in this space. For pre-trained networks, the fixed layer is simply the truncated eigen-decomposition of the original filters; the 1D filters are initialized as the corresponding linear-combination weights and then fine-tuned to recover any performance loss due to the truncation. For training networks from scratch, we use a set of random orthogonal fixed filters (which never change) and learn the 1D weight vectors directly from the labeled data. Our method substantially reduces i) the number of learnable parameters during training, and ii) the number of multiplication operations and the filter storage requirements during inference. It does so without requiring any special operators in the convolution layer, and it extends to all popular CNN architectures. We apply our method to four well-known network architectures trained on three different data sets. Results show a consistent reduction of up to a factor of 5 in the number of operations and up to a factor of 18 in the number of learnable parameters, with less than a 3% drop in performance on the CIFAR-100 dataset.
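
As a rough illustration of the drop-in replacement described above, the following PyTorch sketch replaces a standard convolution with a fixed basis convolution followed by a learnable 1x1 combination layer. The module and argument names (BasisConv2d, num_basis, basis_weight) are our own assumptions for exposition, not the authors' reference implementation.

    # Minimal sketch, assuming a PyTorch setting; not the paper's released code.
    import torch
    import torch.nn as nn

    class BasisConv2d(nn.Module):
        """Replaces Conv2d(in_ch, out_ch, k) with:
           (1) a fixed bank of num_basis filters that is never updated, followed by
           (2) a learnable 1x1 convolution holding the linear-combination weights."""
        def __init__(self, in_channels, out_channels, kernel_size, num_basis,
                     basis_weight=None):
            super().__init__()
            # Fixed filters: either truncated eigen-filters of a pre-trained layer
            # (passed in via basis_weight, shape [num_basis, in_channels, k, k])
            # or random orthogonal filters when training from scratch.
            self.basis = nn.Conv2d(in_channels, num_basis, kernel_size,
                                   padding=kernel_size // 2, bias=False)
            if basis_weight is not None:
                self.basis.weight.data.copy_(basis_weight)
            else:
                nn.init.orthogonal_(self.basis.weight)
            self.basis.weight.requires_grad = False  # the basis is never trained

            # Learnable 1D weights: one combination vector per output filter.
            self.combine = nn.Conv2d(num_basis, out_channels, kernel_size=1, bias=True)

        def forward(self, x):
            return self.combine(self.basis(x))

Compression follows when num_basis is much smaller than out_channels: the spatial convolutions are evaluated once against the small fixed basis, and each output filter then costs only a num_basis-long 1x1 combination, which is where the reductions in parameters and multiplications quoted above come from.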
