Dense Crowd Counting with Capsule Networks

In this paper, we propose and evaluate a capsule network-based (CapsNet-based) model for crowd counting, as an alternative to the convolutional neural network-based (CNN-based) models that predominate in this task. The aim is to jointly generate a high-quality density map from a single image and produce a more precise estimate of the number of people. A CapsNet-based model has a strong representational capacity and a powerful dynamic routing mechanism, which could mitigate the drawback of having a limited number of training samples. Because a CapsNet replaces the scalar activations of a CNN with vectors, it can learn more discriminative features, which contributes to higher-quality density maps and thus to a more precise count of individuals in crowd scenes. Experimental results show that our proposal is competitive with state-of-the-art methods while using 59.2% fewer parameters.
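The link between a density map and a people count can be illustrated with a minimal sketch. It is not the model described above: it only shows the common convention in which each annotated head contributes one unit of mass to the map, so that summing the map gives the count. The function name `density_map`, the fixed `sigma`, and the example coordinates are assumptions made for illustration.

```python
import numpy as np

def density_map(points, height, width, sigma=4.0):
    """Sum one unit-mass Gaussian per annotated head position (row, col).

    Illustrative only: a fixed sigma and per-point renormalisation are
    assumptions, not details taken from the paper.
    """
    rows = np.arange(height)[:, None]   # column vector of row indices
    cols = np.arange(width)[None, :]    # row vector of column indices
    dmap = np.zeros((height, width), dtype=np.float64)
    for r, c in points:
        g = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
        dmap += g / g.sum()  # renormalise so each person contributes exactly 1
    return dmap

heads = [(10, 12), (30, 40), (31, 44)]  # hypothetical head annotations
dmap = density_map(heads, 64, 64)
estimated_count = float(dmap.sum())     # the count is the sum of the map
```

Under this convention, a network that regresses the density map yields a count estimate simply by summing its output, which is why map quality and count precision are tied together.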
