Paper information - In Teacher We Trust: Learning Compressed Models for Pedestrian Detection

Deep convolutional neural networks continue to advance the state of the art in many domains as they grow bigger and more complex. It has been observed that many of the parameters of a large network are redundant, which makes it possible to learn a smaller network that mimics the outputs of the large network through a process called Knowledge Distillation. We show, however, that standard Knowledge Distillation is not effective for learning small models for the task of pedestrian detection. To improve this process, we introduce a higher-dimensional hint layer to increase information flow. We also estimate the variance in the outputs of the large network and propose a loss function that incorporates this uncertainty. Finally, we attempt to boost the complexity of the small network without increasing its size by feeding it hand-designed features that have been demonstrated to be effective for pedestrian detection. We succeed in training a model that contains $400\times$ fewer parameters than the large network while outperforming AlexNet on the Caltech Pedestrian Dataset.
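
For readers unfamiliar with the baseline that the abstract calls standard Knowledge Distillation, the following is a minimal sketch of the commonly used soft-target loss in the style of Hinton et al. It is not the loss proposed in this paper (the hint layer and the variance-aware term are not reproduced here), and the function names, the `temperature` and `alpha` defaults, and the two-class pedestrian/background framing are illustrative assumptions.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits / temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_norm for s in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Soft-target cross-entropy against the teacher plus hard-label cross-entropy.

    temperature and alpha are hypothetical defaults chosen for illustration.
    """
    # Teacher probabilities at the softened temperature.
    teacher_probs = [math.exp(lp) for lp in log_softmax(teacher_logits, temperature)]
    student_soft_log = log_softmax(student_logits, temperature)

    # Cross-entropy between softened teacher and softened student outputs.
    soft_term = -sum(p * lp for p, lp in zip(teacher_probs, student_soft_log))

    # Ordinary cross-entropy with the ground-truth label at temperature 1.
    hard_term = -log_softmax(student_logits)[label]

    # The T^2 factor keeps the soft-term gradient scale comparable across temperatures.
    return alpha * temperature ** 2 * soft_term + (1.0 - alpha) * hard_term


# Two-class example (index 0 = pedestrian, index 1 = background; an assumed convention).
teacher = [2.0, -1.0]
matching_student = [2.0, -1.0]
opposite_student = [-1.0, 2.0]
loss_match = distillation_loss(matching_student, teacher, label=0)
loss_opposite = distillation_loss(opposite_student, teacher, label=0)
```

In this sketch a student whose logits agree with the teacher receives a lower loss than one that disagrees. The abstract's claim is that this kind of output matching alone is not enough for pedestrian detection.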
