Background Noise Filtering and Distribution Dividing for Crowd Counting

Crowd counting is challenging because crowd distributions are diverse and backgrounds interfere with the estimate. In this paper, we propose a new approach to head size estimation that reduces the impact of varying crowd scale and background noise. Rather than relying only on local information, namely the distance between neighboring heads, we also take into account global information about how people are distributed across the whole image. We propagate head size from the far region to the near region (small to large), and insert dummy head points to ensure that the propagation is not interrupted. The estimated head size is then put to further use: dividing the crowd into parts of different densities and generating a high-fidelity head mask. In addition, we design three head mask usage mechanisms and the corresponding head masks to analyze where a mask should be applied and which mask leads to better background filtering. Based on the learned masks, we propose two competitive models that perform crowd estimation robustly under background noise and diverse crowd scale. We evaluate the proposed method on three public crowd counting datasets: ShanghaiTech, UCF-QNRF, and UCF_CC_50. Experimental results demonstrate that the proposed algorithm performs favorably against state-of-the-art crowd counting approaches.
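For orientation, the purely local baseline that the abstract contrasts against can be sketched as follows. This is not the proposed method: it omits the global far-to-near propagation and the dummy head points, and the function names, the `scale` factor, the `default` size, and the circular mask shape are all assumptions made for illustration. Each head size is taken as a fraction of the distance to the nearest annotated neighbor, and a binary head mask is rasterized from those sizes.

```python
import math

def nn_head_sizes(points, scale=0.5, default=15.0):
    """Local-only estimate: head size = scale * distance to nearest neighbor.

    `scale` and `default` are illustrative assumptions, not values from the paper.
    """
    sizes = []
    for i, (xi, yi) in enumerate(points):
        dists = [math.hypot(xi - xj, yi - yj)
                 for j, (xj, yj) in enumerate(points) if j != i]
        # A lone annotation has no neighbor, so fall back to the default size.
        sizes.append(scale * min(dists) if dists else default)
    return sizes

def head_mask(points, sizes, height, width):
    """Rasterize a binary mask with one disc (diameter = head size) per head."""
    mask = [[0] * width for _ in range(height)]
    for (x, y), size in zip(points, sizes):
        r = size / 2.0
        # Scan only the bounding box of the disc, clipped to the image.
        for row in range(max(0, int(y - r)), min(height, int(y + r) + 1)):
            for col in range(max(0, int(x - r)), min(width, int(x + r) + 1)):
                if (col - x) ** 2 + (row - y) ** 2 <= r * r:
                    mask[row][col] = 1
    return mask

# Two hypothetical head annotations (x, y) in a 9 x 5 image.
points = [(2, 2), (6, 2)]
sizes = nn_head_sizes(points)
mask = head_mask(points, sizes, height=5, width=9)
```

A local estimate of this kind degrades in sparse regions, where the nearest neighbor may be far away; this is the limitation that motivates using global distribution information.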
