Smaller Models, Better Generalization

Reducing network complexity has become a major research focus with the advent of mobile technology: convolutional neural networks that perform vision tasks without excessive memory overhead are in high demand. This paper presents a qualitative and quantitative analysis of reducing network complexity using an upper bound on the Vapnik-Chervonenkis dimension, pruning, and quantization. We observe a general trend of improving accuracy as the models are quantized. We propose a novel loss function that achieves considerable sparsity at accuracies comparable to those of dense models. We compare our method against regularizers prevalent in the literature and show that it yields sparser models that generalize well.
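To make the three ingredients concrete, the sketch below illustrates the general pipeline in NumPy: a sparsity-inducing training objective, magnitude pruning, and fixed-point quantization. The paper's actual VC-bound-derived loss is not reproduced here; a plain L1 penalty stands in for a generic sparsity regularizer, and all helper names (`sparsity_loss`, `magnitude_prune`, `quantize_uniform`) are hypothetical, not from the paper.

```python
import numpy as np

def sparsity_loss(task_loss, weights, lam=1e-4):
    """Task loss plus a generic L1 sparsity penalty.

    A stand-in for the paper's VC-bound-derived regularizer (assumption)."""
    return task_loss + lam * sum(np.abs(w).sum() for w in weights)

def magnitude_prune(w, sparsity=0.9):
    """Zero out the given fraction of smallest-magnitude weights."""
    k = int(sparsity * w.size)
    if k == 0:
        return w
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= thresh, 0.0, w)

def quantize_uniform(w, bits=8):
    """Symmetric uniform quantization to the given bit width."""
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    if scale == 0:
        return w
    return np.round(w / scale) * scale

# Usage: train with the sparse objective, then prune, then quantize.
w = np.random.randn(256, 256).astype(np.float32)
loss = sparsity_loss(task_loss=0.42, weights=[w], lam=1e-4)
w_sparse = magnitude_prune(w, sparsity=0.9)
w_q = quantize_uniform(w_sparse, bits=8)
print(f"sparsity: {np.mean(w_q == 0):.2%}, unique values: {len(np.unique(w_q))}")
```

Pruning before quantization keeps the zeroed weights exactly representable at any bit width, which is one reason sparsity and quantization combine well in practice.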
