Uniformity in Heterogeneity: Diving Deep into Count Interval Partition for Crowd Counting

Recently, the problem of inaccurate learning targets in crowd counting has drawn increasing attention. Inspired by a few pioneering works, we address this problem by predicting the indices of pre-defined count interval bins instead of the count values themselves. However, an inappropriate interval setting can make the count error contributions from different intervals extremely imbalanced, leading to inferior counting performance. We therefore propose a novel count interval partition criterion called Uniform Error Partition (UEP), which keeps the expected counting error contributions equal across all intervals in order to minimize the prediction risk. To mitigate the discretization errors that count quantization inevitably introduces, we propose a second criterion called Mean Count Proxies (MCP). The MCP criterion selects the best count proxy for each interval to represent its count value during inference, making the overall expected discretization error of an image nearly negligible. To the best of our knowledge, this work is the first to delve into such a classification task and to arrive at a promising solution for count interval partition. Following these two theoretically demonstrated criteria, we propose a simple yet effective model termed Uniform Error Partition Network (UEPNet), which achieves state-of-the-art performance on several challenging datasets. The code will be available at: TencentYoutuResearch/CrowdCounting-UEPNet.
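
To make the quantization and proxy ideas concrete, the following is a minimal sketch and not the authors' implementation. It assumes synthetic, long-tailed local counts and a fixed set of interval edges. The names `quantize` and `mean_count_proxies` are hypothetical, and the uniform edges are used only for illustration: they are not a UEP partition. The sketch maps each count to an interval index, takes the mean of the counts falling in each interval as that interval's proxy, and compares the signed discretization error of mean proxies with that of interval midpoints.

```python
import numpy as np


def quantize(counts, edges):
    """Map each count to the index of the interval [edges[i], edges[i+1]) containing it."""
    # Using only the interior edges yields indices in 0 .. len(edges) - 2.
    return np.digitize(counts, edges[1:-1])


def mean_count_proxies(counts, edges):
    """Per-interval proxy: the mean of the counts that fall into the interval.

    Empty intervals fall back to the interval midpoint (an assumption of this sketch).
    """
    idx = quantize(counts, edges)
    n_bins = len(edges) - 1
    sums = np.bincount(idx, weights=counts, minlength=n_bins)
    nums = np.bincount(idx, minlength=n_bins)
    midpoints = 0.5 * (edges[:-1] + edges[1:])
    return np.where(nums > 0, sums / np.maximum(nums, 1), midpoints)


# Synthetic local counts (assumption): most patches hold few people, a few hold many.
rng = np.random.default_rng(0)
counts = rng.exponential(scale=2.0, size=10_000)

# Illustrative uniform edges; a UEP partition would choose these differently.
edges = np.linspace(0.0, counts.max() + 1e-6, 9)
midpoints = 0.5 * (edges[:-1] + edges[1:])

idx = quantize(counts, edges)
proxies = mean_count_proxies(counts, edges)

# Signed discretization error after replacing every count by its interval's proxy.
err_mean = float(np.sum(proxies[idx] - counts))
err_mid = float(np.sum(midpoints[idx] - counts))

print(f"signed error with mean proxies: {err_mean:.3e}")
print(f"signed error with midpoints:    {err_mid:.3e}")
```

On the data used to fit them, mean proxies cancel the signed error inside every interval by construction, so the total differs from zero only by floating-point rounding. Midpoint proxies systematically overestimate when the count density decreases within an interval, as it does for this synthetic distribution.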
