Abstract: In this paper, we propose a density-aware and background-aware network via multi-task learning (MTL-DB) for crowd counting. The goal is to let the model capture high-level semantic information about density and background through multi-task joint training, which in turn helps optimize the generation of density maps. First, MTL-DB uses the first ten layers of VGG-16 with Batch Normalization as the front-end to extract primary features that are shared by all tasks. Then, a multi-task back-end is constructed by integrating the main task of density map estimation with two auxiliary tasks: density classification and background segmentation. The density classification task captures density-related information with a fully connected classifier, while the background segmentation task applies a dilated convolutional network to distinguish pedestrians' head regions from the background. With this high-level semantic awareness, the main task generates estimated density maps using normal convolutional layers. Furthermore, a multi-task joint loss is proposed to improve the quality of the estimated density maps. Extensive experiments on three challenging crowd datasets (ShanghaiTech Part A & B, UCF_CC_50, and UCF_QNRF) verify the effectiveness of this multi-task learning model. On both Part A and Part B of the ShanghaiTech dataset, MTL-DB outperforms other multi-task learning methods.
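The abstract does not give the form of the multi-task joint loss, so the following is only a minimal sketch of how one main loss and two auxiliary losses might be combined. It assumes a weighted sum of a pixel-wise mean squared error for the density map, a cross-entropy for density classification, and a binary cross-entropy for background segmentation. The function names, the weights `w_cls` and `w_seg`, and the choice of each loss term are illustrative assumptions, not the authors' formulation.

```python
import math

def density_mse(pred, target):
    """Mean squared error between two density maps given as lists of rows."""
    total, n = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            total += (p - t) ** 2
            n += 1
    return total / n

def cross_entropy(logits, label):
    """Softmax cross-entropy for one density-class prediction."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def binary_cross_entropy(probs, mask, eps=1e-7):
    """Mean binary cross-entropy between head-region probabilities and a 0/1 mask."""
    total, n = 0.0, 0
    for prob_row, mask_row in zip(probs, mask):
        for p, m in zip(prob_row, mask_row):
            p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
            total -= m * math.log(p) + (1 - m) * math.log(1.0 - p)
            n += 1
    return total / n

def joint_loss(pred_density, gt_density, cls_logits, cls_label,
               seg_probs, seg_mask, w_cls=0.1, w_seg=0.1):
    """Hypothetical joint loss: main density loss plus weighted auxiliary losses.

    w_cls and w_seg are assumed hyperparameters, not values from the paper.
    """
    return (density_mse(pred_density, gt_density)
            + w_cls * cross_entropy(cls_logits, cls_label)
            + w_seg * binary_cross_entropy(seg_probs, seg_mask))

def estimated_count(density):
    """The crowd count is the sum over all pixels of the density map."""
    return sum(sum(row) for row in density)
```

In such a setup only the density branch is needed at inference time: the predicted count is read off by summing the estimated density map, while the two auxiliary terms shape the shared features during training.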