Runtime Neural Pruning

In this paper, we propose a Runtime Neural Pruning (RNP) framework which prunes a deep neural network dynamically at runtime. Unlike existing neural pruning methods, which produce a fixed pruned model for deployment, our method preserves the full ability of the original network and conducts pruning adaptively according to the input image and the current feature maps. The pruning is performed in a bottom-up, layer-by-layer manner, which we model as a Markov decision process and train with reinforcement learning. The agent judges the importance of each convolutional kernel and conducts channel-wise pruning conditioned on each sample, so that the network is pruned more aggressively when the image is easier for the task. Since the ability of the network is fully preserved, the balance point between speed and accuracy is easily adjustable according to the available resources. Our method can be applied to off-the-shelf network structures and achieves a better tradeoff between speed and accuracy, especially at large pruning rates.
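
To make the mechanism concrete, below is a minimal PyTorch sketch of the inference-time behaviour: a small decision network (standing in for the trained reinforcement-learning agent) inspects the pooled input feature map of a convolutional layer and chooses, per sample, how many channel groups of the layer's output to keep. All names here (`RuntimePrunedConv`, `decision_net`, the channel-grouping scheme) are illustrative assumptions rather than the authors' implementation, and the reward-driven Q-learning training loop is omitted.

```python
# A hedged sketch of per-sample, channel-wise runtime pruning.
# Names and structure are assumptions for illustration only.
import torch
import torch.nn as nn

class RuntimePrunedConv(nn.Module):
    """Conv layer whose output channels are gated per sample at runtime.

    A small decision network (a stand-in for the RL agent) looks at the
    globally pooled input feature map and picks an action k, meaning
    "keep the first k+1 channel groups"; the remaining channels are
    zeroed here, and would simply not be computed in a real deployment.
    """
    def __init__(self, in_ch, out_ch, num_groups=4):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.num_groups = num_groups
        self.group_size = out_ch // num_groups
        # Stand-in for the agent's value network: state -> score per action.
        self.decision_net = nn.Linear(in_ch, num_groups)

    def forward(self, x):
        # Global-average-pool the input feature map as the agent's state.
        state = x.mean(dim=(2, 3))              # (B, in_ch)
        q_values = self.decision_net(state)     # (B, num_groups)
        action = q_values.argmax(dim=1)         # greedy action per sample
        y = self.conv(x)                        # (B, out_ch, H, W)
        # Per-sample channel mask: keep the first (action+1)*group_size channels.
        keep = (action + 1) * self.group_size   # (B,)
        idx = torch.arange(y.size(1), device=y.device)
        mask = (idx.unsqueeze(0) < keep.unsqueeze(1)).float()
        return y * mask[:, :, None, None]

# Usage: "easier" inputs would be routed through fewer channel groups.
layer = RuntimePrunedConv(in_ch=16, out_ch=32, num_groups=4)
out = layer(torch.randn(2, 16, 8, 8))
print(out.shape)  # torch.Size([2, 32, 8, 8]), unused channels zeroed
```

In the paper's formulation the action at each layer is chosen by a Q-learning agent trained with a reward that trades off task loss against computation cost; the greedy forward pass above only illustrates how a per-sample decision translates into channel-wise pruning.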
