Dual Skipping Networks

Inspired by recent neuroscience studies on the left-right asymmetry of the human brain in processing low and high spatial frequency information, this paper introduces a dual skipping network that performs coarse-to-fine object categorization. The network has two branches that handle coarse and fine-grained classification simultaneously. Specifically, we propose a layer-skipping mechanism in which a learned gating network predicts which layers to skip at test time. This mechanism makes the network flexible in practice. We evaluate the network on several widely used coarse-to-fine object categorization benchmarks, where it achieves promising results.
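To make the layer-skipping idea concrete, the following is a minimal NumPy sketch of a single gated residual block at test time. It is not the paper's implementation: the abstract does not specify the form of the gating network, so the linear-plus-sigmoid gate, the 0.5 threshold, and the names `residual_branch`, `gate_probability`, and `gated_block` are all assumptions made for illustration.

```python
import numpy as np

def residual_branch(x, weight):
    """Hypothetical stand-in for a block's transform: linear map followed by ReLU."""
    return np.maximum(x @ weight, 0.0)

def gate_probability(x, gate_w, gate_b):
    """Hypothetical gate: sigmoid of a linear score computed from the block input."""
    score = x @ gate_w + gate_b
    return 1.0 / (1.0 + np.exp(-score))

def gated_block(x, weight, gate_w, gate_b, threshold=0.5):
    """Apply the residual branch only to inputs whose gate passes the threshold.

    Inputs whose gate probability falls below the threshold (an assumed value)
    skip the block and pass through unchanged via the identity path.
    """
    p = gate_probability(x, gate_w, gate_b)   # shape: (batch,)
    execute = p >= threshold                  # boolean mask per input
    out = x.copy()
    if execute.any():
        out[execute] = x[execute] + residual_branch(x[execute], weight)
    return out, execute

# Two toy inputs: the first gets a high gate score, the second a low one.
x = np.array([[1.0, 2.0], [-1.0, -2.0]])
weight = np.eye(2)
gate_w = np.array([1.0, 1.0])
gate_b = 0.0

out, execute = gated_block(x, weight, gate_w, gate_b)
print(execute)  # which inputs ran the block
print(out)      # skipped inputs are returned unchanged
```

In this sketch the skipped input costs only the gate evaluation, which is where a test-time saving would come from. How the gate is trained, and how the two branches share or separate their gates, is left open here because the abstract does not say.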