Aggregated channels network for real-time pedestrian detection

Convolutional neural networks (CNNs) have demonstrated their superiority in numerous computer vision tasks, yet their computational cost results prohibitive for many real-time applications such as pedestrian detection which is usually performed on low-consumption hardware. In order to alleviate this drawback, most strategies focus on using a two-stage cascade approach. Essentially, in the first stage a fast method generates a significant but reduced amount of high quality proposals that later, in the second stage, are evaluated by the CNN. In this work, we propose a novel detection pipeline that further benefits from the two-stage cascade strategy. More concretely, the enriched and subsequently compressed features used in the first stage are reused as the CNN input. As a consequence, a simpler network architecture, adapted for such small input sizes, allows to achieve real-time performance and obtain results close to the state-of-the-art while running significantly faster without the use of GPU. In particular, considering that the proposed pipeline runs in frame rate, the achieved performance is highly competitive. We furthermore demonstrate that the proposed pipeline on itself can serve as an effective proposal generator.

[1]  Fan Yang,et al.  Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Joon Hee Han,et al.  Local Decorrelation For Improved Detection , 2014, ArXiv.

[3]  B. Schiele,et al.  How Far are We from Solving Pedestrian Detection? , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Eli Shechtman,et al.  Matching Local Self-Similarities across Images and Videos , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Anelia Angelova,et al.  Real-Time Pedestrian Detection with Deep Network Cascades , 2015, BMVC.

[6]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[7]  Andreas Geiger,et al.  Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Luc Van Gool,et al.  Pedestrian detection at 100 frames per second , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Xiaogang Wang,et al.  Switchable Deep Network for Pedestrian Detection , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  Rogério Schmidt Feris,et al.  A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection , 2016, ECCV.

[12]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[13]  Xiaogang Wang,et al.  A discriminative deep model for pedestrian detection with occlusion handling , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Bernt Schiele,et al.  Multi-cue onboard pedestrian detection , 2009, CVPR.

[15]  Bernt Schiele,et al.  Filtered channel features for pedestrian detection , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Shengcai Liao,et al.  Robust Multi-resolution Pedestrian Detection in Traffic Scenes , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Andrea Vedaldi,et al.  MatConvNet: Convolutional Neural Networks for MATLAB , 2014, ACM Multimedia.

[18]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  Shuicheng Yan,et al.  Scale-Aware Fast R-CNN for Pedestrian Detection , 2015, IEEE Transactions on Multimedia.

[20]  Xiaogang Wang,et al.  Single-Pedestrian Detection Aided by Multi-pedestrian Detection , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  Thomas Brox,et al.  Striving for Simplicity: The All Convolutional Net , 2014, ICLR.

[22]  Liang Lin,et al.  Is Faster R-CNN Doing Well for Pedestrian Detection? , 2016, ECCV.

[23]  David Vázquez,et al.  Random Forests of Local Experts for Pedestrian Detection , 2013, 2013 IEEE International Conference on Computer Vision.

[24]  Silvio Savarese,et al.  Subcategory-Aware Convolutional Neural Networks for Object Proposals and Detection , 2016, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV).

[25]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[26]  Nuno Vasconcelos,et al.  Learning Complexity-Aware Cascades for Deep Pedestrian Detection , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[27]  Bernt Schiele,et al.  Ten Years of Pedestrian Detection, What Have We Learned? , 2014, ECCV Workshops.

[28]  Bin Yang,et al.  Convolutional Channel Features , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[29]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Xiaogang Wang,et al.  Multi-stage Contextual Deep Learning for Pedestrian Detection , 2013, 2013 IEEE International Conference on Computer Vision.

[31]  Pietro Perona,et al.  Fast Feature Pyramids for Object Detection , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[32]  Andrew Zisserman,et al.  Return of the Devil in the Details: Delving Deep into Convolutional Nets , 2014, BMVC.

[33]  David A. McAllester,et al.  A discriminatively trained, multiscale, deformable part model , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[34]  David Gerónimo Gómez,et al.  Computer Vision Approaches to Pedestrian Detection: Visible Spectrum Survey , 2007, IbPRIA.

[35]  Pietro Perona,et al.  Integral Channel Features , 2009, BMVC.

[36]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[38]  Pietro Perona,et al.  Pedestrian Detection: An Evaluation of the State of the Art , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[39]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[40]  Piotr Dollár,et al.  Crosstalk Cascades for Frame-Rate Pedestrian Detection , 2012, ECCV.

[41]  Yann LeCun,et al.  Pedestrian Detection with Unsupervised Multi-stage Feature Learning , 2012, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[42]  Bernt Schiele,et al.  Taking a deeper look at pedestrians , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Charless C. Fowlkes,et al.  Multiresolution Models for Object Detection , 2010, ECCV.

[44]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[45]  Xiaogang Wang,et al.  Modeling Mutual Visibility Relationship in Pedestrian Detection , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[46]  Pietro Perona,et al.  The Fastest Pedestrian Detector in the West , 2010, BMVC.