Modeling Noisy Annotations for Crowd Counting

The annotation noise in crowd counting is not modeled in traditional crowd counting algorithms based on crowd density maps. In this paper, we first model the annotation noise using a random variable with Gaussian distribution, and derive the pdf of the crowd density value for each spatial location in the image. We then approximate the joint distribution of the density values (i.e., the distribution of density maps) with a full covariance multivariate Gaussian density, and derive a low-rank approximate for tractable implementation. We use our loss function to train a crowd density map estimator and achieve state-of-the-art performance on three large-scale crowd counting datasets, which confirms its effectiveness. Examination of the predictions of the trained model shows that it can correctly predict the locations of people in spite of the noisy training data, which demonstrates the robustness of our loss function to annotation noise.

[1]  Qi Wang,et al.  NWPU-Crowd: A Large-Scale Benchmark for Crowd Counting and Localization , 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Vishal M. Patel,et al.  Multi-Level Bottom-Top and Top-Bottom Feature Fusion for Crowd Counting , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[3]  Vishal M. Patel,et al.  CNN-Based cascaded multi-task learning of high-level prior and density estimation for crowd counting , 2017, 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[4]  Antoni B. Chan,et al.  Beyond Counting: Comparisons of Density Maps for Crowd Analysis Tasks—Counting, Detection, and Tracking , 2017, IEEE Transactions on Circuits and Systems for Video Technology.

[5]  Antoni B. Chan,et al.  Adaptive Density Map Generation for Crowd Counting , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[6]  Lior Wolf,et al.  Learning to Count with CNN Boosting , 2016, ECCV.

[7]  Shaogang Gong,et al.  Crowd Counting and Profiling: Methodology and Evaluation , 2013, Modeling, Simulation and Visual Analysis of Crowds.

[8]  Haroon Idrees,et al.  Composition Loss for Counting, Density Map Estimation and Localization in Dense Crowds , 2018, ECCV.

[9]  Roberto Cipolla,et al.  Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding , 2015, BMVC.

[10]  Wei Wu,et al.  Adaptive Dilated Network With Self-Correction Supervision for Counting , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  R. Venkatesh Babu,et al.  Top-Down Feedback for Crowd Counting Convolutional Neural Network , 2018, AAAI.

[12]  Fei Su,et al.  Scale Aggregation Network for Accurate and Efficient Crowd Counting , 2018, ECCV.

[13]  Hieu Le,et al.  Iterative Crowd Counting , 2018, ECCV.

[14]  Qijun Zhao,et al.  Point in, Box Out: Beyond Counting Persons in Crowds , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Bernt Schiele,et al.  Pedestrian detection in crowded scenes , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[16]  R. Venkatesh Babu,et al.  Locate, Size, and Count: Accurately Resolving People in Dense Crowds via Detection , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  Liang Lin,et al.  Crowd Counting using Deep Recurrent Spatial-Aware Network , 2018, IJCAI.

[18]  Antoni B. Chan,et al.  Crowd Counting by Adaptively Fusing Predictions from an Image Pyramid , 2018, BMVC.

[19]  Yuhong Li,et al.  CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[20]  Dinesh Manocha,et al.  Modeling, Simulation and Visual Analysis of Crowds , 2013, The International Series in Video Computing.

[21]  R. Venkatesh Babu,et al.  Divide and Grow: Capturing Huge Diversity in Crowd Images with Incrementally Growing CNN , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[22]  Haroon Idrees,et al.  Multi-source Multi-scale Counting in Extremely Dense Crowd Images , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Mark W. Schmidt,et al.  Where are the Blobs: Counting by Localization with Point Supervision , 2018, ECCV.

[24]  Andrew Zisserman,et al.  Learning To Count Objects in Images , 2010, NIPS.

[25]  Alex Kendall,et al.  What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? , 2017, NIPS.

[26]  Yihong Gong,et al.  Bayesian Loss for Crowd Count Estimation With Point Supervision , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[27]  Vishal M. Patel,et al.  JHU-CROWD++: Large-Scale Crowd Counting Dataset and A Benchmark Method , 2020, IEEE transactions on pattern analysis and machine intelligence.

[28]  Tieniu Tan,et al.  Estimating the number of people in crowded scenes by MID based foreground segmentation and head-shoulder detection , 2008, 2008 19th International Conference on Pattern Recognition.

[29]  Xiaogang Wang,et al.  Cross-scene crowd counting via deep convolutional neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Radford M. Neal Pattern Recognition and Machine Learning , 2007, Technometrics.

[31]  Baoyuan Wu,et al.  Residual Regression With Semantic Prior for Crowd Counting , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Antoni B. Chan,et al.  Small instance detection by integer programming on object density maps , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Wei Lin,et al.  Learning From Synthetic Data for Crowd Counting in the Wild , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Guanbin Li,et al.  Crowd Counting With Deep Structured Scale Integration Network , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[35]  Nuno Vasconcelos,et al.  Privacy preserving crowd monitoring: Counting people without people models or tracking , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[36]  Vishal M. Patel,et al.  Generating High-Quality Crowd Density Maps Using Contextual Pyramid CNNs , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[37]  Guoyan Zheng,et al.  Crowd Counting with Deep Negative Correlation Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[38]  Shenghua Gao,et al.  Single-Image Crowd Counting via Multi-Column Convolutional Neural Network , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Shiv Surya,et al.  Switching Convolutional Neural Network for Crowd Counting , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Yichen Wei,et al.  Data Uncertainty Learning in Face Recognition , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Bingbing Ni,et al.  Crowd Counting via Adversarial Cross-Scale Consistency Pursuit , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[42]  Shaogang Gong,et al.  From Semi-supervised to Transfer Counting of Crowds , 2013, 2013 IEEE International Conference on Computer Vision.

[43]  Nuno Vasconcelos,et al.  Bayesian Poisson regression for crowd counting , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[44]  Dit-Yan Yeung,et al.  Spatiotemporal Modeling for Crowd Counting in Videos , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[45]  Joost van de Weijer,et al.  Leveraging Unlabeled Data for Crowd Counting by Learning to Rank , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.