Crowd Counting with Decomposed Uncertainty

Research in neural networks in the field of computer vision has achieved remarkable accuracy for point estimation. However, the uncertainty in the estimation is rarely addressed. Uncertainty quantification accompanied by point estimation can lead to a more informed decision, and even improve the prediction quality. In this work, we focus on uncertainty estimation in the domain of crowd counting. With increasing occurrences of heavily crowded events such as political rallies, protests, concerts, etc., automated crowd analysis is becoming an increasingly crucial task. The stakes can be very high in many of these real-world applications. We propose a scalable neural network framework with quantification of decomposed uncertainty using a bootstrap ensemble. We demonstrate that the proposed uncertainty quantification method provides additional insight to the crowd counting problem and is simple to implement. We also show that our proposed method exhibits the state of the art performances in many benchmark crowd counting datasets.

[1]  Paul A. Viola,et al.  Robust Real-Time Face Detection , 2001, International Journal of Computer Vision.

[2]  Andrew Zisserman,et al.  Learning To Count Objects in Images , 2010, NIPS.

[3]  Lior Wolf,et al.  Learning to Count with CNN Boosting , 2016, ECCV.

[4]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Karthikeyan Natesan Ramamurthy,et al.  Counting and Segmenting Sorghum Heads , 2019, ArXiv.

[6]  Noel E. O'Connor,et al.  Fully Convolutional Crowd Counting on Highly Congested Scenes , 2016, VISIGRAPP.

[7]  Andrew Zisserman,et al.  Interactive Object Counting , 2014, ECCV.

[8]  Jordi Vitrià,et al.  Learning to count with deep object features , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[9]  Joost van de Weijer,et al.  Leveraging Unlabeled Data for Crowd Counting by Learning to Rank , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[10]  Nuno Vasconcelos,et al.  Bayesian Poisson regression for crowd counting , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[11]  Ariel D. Procaccia,et al.  Variational Dropout and the Local Reparameterization Trick , 2015, NIPS.

[12]  Vishal M. Patel,et al.  CNN-Based cascaded multi-task learning of high-level prior and density estimation for crowd counting , 2017, 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[13]  Benjamin Van Roy,et al.  Deep Exploration via Bootstrapped DQN , 2016, NIPS.

[14]  David J. C. MacKay,et al.  A Practical Bayesian Framework for Backpropagation Networks , 1992, Neural Computation.

[15]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[16]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[17]  Fei Su,et al.  Scale Aggregation Network for Accurate and Efficient Crowd Counting , 2018, ECCV.

[18]  Daniel Oñoro-Rubio,et al.  Towards Perspective-Free Object Counting with Deep Learning , 2016, ECCV.

[19]  Yuhong Li,et al.  CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[20]  Pascal Fua,et al.  Context-Aware Crowd Counting , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Sridha Sridharan,et al.  Crowd Counting Using Multiple Local Features , 2009, 2009 Digital Image Computing: Techniques and Applications.

[22]  Julien Cornebise,et al.  Weight Uncertainty in Neural Networks , 2015, ArXiv.

[23]  Charles Blundell,et al.  Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles , 2016, NIPS.

[24]  Radford M. Neal Bayesian Learning via Stochastic Dynamics , 1992, NIPS.

[25]  Takio Kurita,et al.  Mixture of Counting CNNs: Adaptive Integration of CNNs Specialized to Specific Appearance for Crowd Counting , 2017, ArXiv.

[26]  Shaogang Gong,et al.  Crowd Counting and Profiling: Methodology and Evaluation , 2013, Modeling, Simulation and Visual Analysis of Crowds.

[27]  Hieu Le,et al.  Iterative Crowd Counting , 2018, ECCV.

[28]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Vishal M. Patel,et al.  A Survey of Recent Advances in CNN-based Single Image Crowd Counting and Density Estimation , 2017, Pattern Recognit. Lett..

[30]  Xiaogang Wang,et al.  Cross-scene crowd counting via deep convolutional neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  A. Weigend,et al.  Estimating the mean and variance of the target probability distribution , 1994, Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94).

[32]  Max Welling,et al.  Structured and Efficient Variational Deep Learning with Matrix Gaussian Posteriors , 2016, ICML.

[33]  Vishal M. Patel,et al.  Generating High-Quality Crowd Density Maps Using Contextual Pyramid CNNs , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[34]  Alex Graves,et al.  Practical Variational Inference for Neural Networks , 2011, NIPS.

[35]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  Lu Wang,et al.  Crowd counting and segmentation in visual surveillance , 2009, 2009 16th IEEE International Conference on Image Processing (ICIP).

[37]  Zoubin Ghahramani,et al.  Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning , 2015, ICML.

[38]  Guoyan Zheng,et al.  Crowd Counting with Deep Negative Correlation Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[39]  Alexander J. Smola,et al.  Heteroscedastic Gaussian process regression , 2005, ICML.

[40]  Shenghua Gao,et al.  Single-Image Crowd Counting via Multi-Column Convolutional Neural Network , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Shiv Surya,et al.  Switching Convolutional Neural Network for Crowd Counting , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Haroon Idrees,et al.  Multi-source Multi-scale Counting in Extremely Dense Crowd Images , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[43]  Roberto Cipolla,et al.  SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[44]  Shaogang Gong,et al.  Feature Mining for Localised Crowd Counting , 2012, BMVC.

[45]  Andrew Zisserman,et al.  Microscopy cell counting and detection with fully convolutional regression networks , 2018, Comput. methods Biomech. Biomed. Eng. Imaging Vis..

[46]  José M. F. Moura,et al.  FCN-rLSTM: Deep Spatio-Temporal Neural Networks for Vehicle Counting in City Cameras , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[47]  Min-hwan Oh,et al.  Sequential Anomaly Detection using Inverse Reinforcement Learning , 2019, KDD.

[48]  Haroon Idrees,et al.  Composition Loss for Counting, Density Map Estimation and Localization in Dense Crowds , 2018, ECCV.

[49]  Alex Kendall,et al.  What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? , 2017, NIPS.

[50]  Nuno Vasconcelos,et al.  Privacy preserving crowd monitoring: Counting people without people models or tracking , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[51]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[52]  Tieniu Tan,et al.  Estimating the number of people in crowded scenes by MID based foreground segmentation and head-shoulder detection , 2008, 2008 19th International Conference on Pattern Recognition.

[53]  Ahmad Kamal Nasir,et al.  Precision Forestry: Trees Counting in Urban Areas Using Visible Imagery based on an Unmanned Aerial Vehicle , 2016 .

[54]  Haroon Idrees,et al.  Detecting Humans in Dense Crowds Using Locally-Consistent Scale Prior and Global Occlusion Reasoning , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[55]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[56]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[57]  Pietro Perona,et al.  Pedestrian Detection: An Evaluation of the State of the Art , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[58]  Nuno Vasconcelos,et al.  Counting People With Low-Level Features and Bayesian Regression , 2012, IEEE Transactions on Image Processing.

[59]  John Platt,et al.  Probabilistic Outputs for Support vector Machines and Comparisons to Regularized Likelihood Methods , 1999 .

[60]  Stefano Ermon,et al.  Accurate Uncertainties for Deep Learning Using Calibrated Regression , 2018, ICML.

[61]  Bernt Schiele,et al.  Pedestrian detection in crowded scenes , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[62]  Mao Ye,et al.  Fast crowd density estimation with convolutional neural networks , 2015, Eng. Appl. Artif. Intell..

[63]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[64]  Shin-ichi Maeda,et al.  A Bayesian encourages dropout , 2014, ArXiv.

[65]  Haizhou Ai,et al.  End-to-end crowd counting via joint learning local and global count , 2016, 2016 IEEE International Conference on Image Processing (ICIP).

[66]  Max Welling,et al.  Multiplicative Normalizing Flows for Variational Bayesian Neural Networks , 2017, ICML.

[67]  D. Freedman,et al.  Some Asymptotic Theory for the Bootstrap , 1981 .

[68]  Wei Lin,et al.  Learning From Synthetic Data for Crowd Counting in the Wild , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[69]  Xiaochun Cao,et al.  Deep People Counting in Extremely Dense Crowds , 2015, ACM Multimedia.

[70]  Ryuzo Okada,et al.  COUNT Forest: CO-Voting Uncertain Number of Targets Using Random Forest for Crowd Density Estimation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[71]  Liang Lin,et al.  Crowd Counting using Deep Recurrent Spatial-Aware Network , 2018, IJCAI.