Multi-Scale Dense Networks for Resource Efficient Image Classification

In this paper we investigate image classification under computational resource limits at test time. Two such settings are: (1) anytime classification, where the network's prediction for a test example is progressively updated, facilitating the output of a prediction at any time; and (2) budgeted batch classification, where a fixed computational budget, which can be spent unevenly across "easier" and "harder" inputs, is available to classify a set of examples. In contrast to most prior work, such as the popular Viola and Jones algorithm, our approach is based on convolutional neural networks. We train multiple classifiers with varying resource demands, which we adaptively apply during test time. To maximally re-use computation between the classifiers, we incorporate them as early exits into a single deep convolutional neural network and inter-connect them with dense connectivity. To facilitate high-quality classification early on, we use a two-dimensional multi-scale network architecture that maintains both coarse- and fine-level features throughout the network. Experiments on three image-classification tasks demonstrate that our framework substantially improves upon the existing state of the art in both settings.
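To make the early-exit mechanism concrete, below is a minimal sketch in PyTorch. The `EarlyExitNet` class, its two-block layout, and the 0.9 confidence threshold are illustrative assumptions, not the paper's actual architecture: the real network additionally maintains coarse and fine feature maps at multiple scales and connects layers densely across exits.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EarlyExitNet(nn.Module):
    """Toy CNN with intermediate classifiers ("early exits").

    Illustrates the early-exit idea only; it omits the multi-scale
    feature maps and dense connectivity of the actual architecture.
    """

    def __init__(self, num_classes=10):
        super().__init__()
        self.block1 = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.block2 = nn.Sequential(
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # One classifier head per block, each on globally pooled features.
        self.exit1 = nn.Linear(32, num_classes)
        self.exit2 = nn.Linear(64, num_classes)

    def forward(self, x):
        """Return the logits of every exit, cheapest first."""
        f1 = self.block1(x)
        out1 = self.exit1(F.adaptive_avg_pool2d(f1, 1).flatten(1))
        f2 = self.block2(f1)
        out2 = self.exit2(F.adaptive_avg_pool2d(f2, 1).flatten(1))
        return [out1, out2]

@torch.no_grad()
def budgeted_predict(model, x, threshold=0.9):
    """Stop at the first exit whose top softmax score clears `threshold`,
    so easy inputs consume less computation than hard ones.
    Assumes a single example (batch size 1) for simplicity."""
    model.eval()
    f = x
    blocks = [model.block1, model.block2]
    exits = [model.exit1, model.exit2]
    for i, (block, head) in enumerate(zip(blocks, exits)):
        f = block(f)
        logits = head(F.adaptive_avg_pool2d(f, 1).flatten(1))
        conf, pred = F.softmax(logits, dim=1).max(dim=1)
        if conf.item() >= threshold or i == len(blocks) - 1:
            return pred.item(), i  # prediction and the exit that produced it
```

In the anytime setting one would instead evaluate the exits in sequence and report the most recent prediction whenever computation is interrupted; in the budgeted-batch setting the threshold would be chosen on a validation set so that the expected total cost of classifying the whole batch fits the budget.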
