Weakly Supervised Crowd-Wise Attention For Robust Crowd Counting

Because application scenes vary widely, robust crowd counting remains difficult, and current performance is far from satisfactory. In this paper, we propose a novel robust crowd counting method built on a weakly supervised crowd-wise attention network. The proposed work improves counting accuracy and robustness in two ways: i) Weakly supervised crowd segmentation. Using a segmentation label generated by motion-guided region growth, the appearance features of a single labeled image are combined with motion features extracted from its adjacent unlabeled frames to perform weakly supervised crowd region segmentation, so that the active crowd region can be finely separated from varied background disturbances. ii) More accurate spatial attention. We generate a spatial attention map from the active crowd segmentation and use it to reweight the appearance features, yielding attention-based density estimation. Evaluation on the widely used WorldExpo'10 dataset shows that the proposed work achieves state-of-the-art performance in both accuracy and robustness.
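The reweighting step in ii) can be illustrated with a minimal sketch. It assumes the attention map is a per-pixel value in [0, 1] multiplied element-wise into every feature channel, and that density is read out by a simple per-pixel linear head followed by a ReLU. The function names, the toy feature values, and the linear head are all hypothetical stand-ins for illustration; the abstract does not specify the actual network layers.

```python
def reweight_features(features, attention):
    """Multiply every channel of `features` (C x H x W nested lists)
    element-wise by the spatial `attention` map (H x W, values in [0, 1])."""
    return [
        [[f * a for f, a in zip(f_row, a_row)]
         for f_row, a_row in zip(channel, attention)]
        for channel in features
    ]


def density_from_features(features, weights):
    """Hypothetical density head: per-pixel weighted sum over channels
    (comparable to a 1x1 convolution), clamped at zero by a ReLU."""
    height, width = len(features[0]), len(features[0][0])
    density = [[0.0] * width for _ in range(height)]
    for channel, w in zip(features, weights):
        for y in range(height):
            for x in range(width):
                density[y][x] += w * channel[y][x]
    return [[max(0.0, v) for v in row] for row in density]


def crowd_count(density):
    """The estimated count is the sum of the density map."""
    return sum(sum(row) for row in density)


# Toy input: 2 channels over a 2 x 3 grid (illustrative values only).
features = [
    [[1.0, 2.0, 4.0], [0.5, 0.0, 3.0]],
    [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]],
]
# Assumed attention map: left column is background (0), the rest is crowd.
attention = [[0.0, 1.0, 1.0], [0.0, 0.5, 1.0]]
weights = [0.5, 0.25]  # hypothetical head weights

plain_count = crowd_count(density_from_features(features, weights))
attended = reweight_features(features, attention)
attended_count = crowd_count(density_from_features(attended, weights))
print(plain_count, attended_count)
```

In this toy case the background column stops contributing to the count once the attention map is applied, which is the effect the abstract attributes to segmentation-based attention.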
