A Strong Baseline for Crowd Counting and Unsupervised People Localization

In this paper, we explore a strong baseline for crowd counting and an unsupervised people localization algorithm based on estimated density maps. Firstly, existing methods achieve state-of-the-art performance based on different backbones and kinds of training tricks. We collect different backbones and training tricks and evaluate the impact of changing them and develop an efficient pipeline for crowd counting, which decreases MAE and RMSE significantly on multiple datasets. We also propose a clustering algorithm named isolated KMeans to locate the heads in density maps. This method can divide the density maps into subregions and find the centers under local count constraints without training any parameter and can be integrated with existing methods easily.

[1]  Toby P. Breckon,et al.  Segmentation Guided Attention Network for Crowd Counting via Curriculum Learning , 2019, ArXiv.

[2]  Fei Su,et al.  Scale Aggregation Network for Accurate and Efficient Crowd Counting , 2018, ECCV.

[3]  Qijun Chen,et al.  Revisiting Perspective Information for Efficient Crowd Counting , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Haroon Idrees,et al.  Multi-source Multi-scale Counting in Extremely Dense Crowd Images , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Quoc V. Le,et al.  Swish: a Self-Gated Activation Function , 2017, 1710.05941.

[6]  Yadong Mu,et al.  Recurrent Attentive Zooming for Joint Crowd Counting and Precise Localization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Wei Lin,et al.  Learning From Synthetic Data for Crowd Counting in the Wild , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Shenghua Gao,et al.  Single-Image Crowd Counting via Multi-Column Convolutional Neural Network , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Qijun Zhao,et al.  Point in, Box Out: Beyond Counting Persons in Crowds , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Pascal Fua,et al.  Context-Aware Crowd Counting , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Baoyuan Wu,et al.  Residual Regression With Semantic Prior for Crowd Counting , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Yongdong Zhang,et al.  Dense Scale Network for Crowd Counting , 2019, ICMR.

[13]  Zhi Zhang,et al.  Bag of Tricks for Image Classification with Convolutional Neural Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Toby P. Breckon,et al.  Crowd Counting via Segmentation Guided Attention Networks and Curriculum Loss , 2019, IEEE Transactions on Intelligent Transportation Systems.

[15]  Ling Shao,et al.  Crowd Counting and Density Estimation by Trellis Encoder-Decoder Networks , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Haroon Idrees,et al.  Composition Loss for Counting, Density Map Estimation and Localization in Dense Crowds , 2018, ECCV.

[17]  Changxin Gao,et al.  Scale Pyramid Network for Crowd Counting , 2019, 2019 IEEE Winter Conference on Applications of Computer Vision (WACV).

[18]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[19]  Li Pan,et al.  ADCrowdNet: An Attention-Injective Deformable Convolutional Network for Crowd Understanding , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Eero P. Simoncelli,et al.  Image quality assessment: from error visibility to structural similarity , 2004, IEEE Transactions on Image Processing.

[21]  Yuhong Li,et al.  CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.