Adaptive Dilated Network With Self-Correction Supervision for Counting

The counting problem aims to estimate the number of objects in images. Due to large scale variation and labeling deviations, it remains a challenging task. The static density map supervised learning framework is widely used in existing methods, which uses the Gaussian kernel to generate a density map as the learning target and utilizes the Euclidean distance to optimize the model. However, the framework is intolerable to the labeling deviations and can not reflect the scale variation. In this paper, we propose an adaptive dilated convolution and a novel supervised learning framework named self-correction (SC) supervision. In the supervision level, the SC supervision utilizes the outputs of the model to iteratively correct the annotations and employs the SC loss to simultaneously optimize the model from both the whole and the individuals. In the feature level, the proposed adaptive dilated convolution predicts a continuous value as the specific dilation rate for each location, which adapts the scale variation better than a discrete and static dilation rate. Extensive experiments illustrate that our approach has achieved a consistent improvement on four challenging benchmarks. Especially, our approach achieves better performance than the state-of-the-art methods on all benchmark datasets.

[1]  Saturnino Maldonado-Bascón,et al.  Extremely Overlapping Vehicle Counting , 2015, IbPRIA.

[2]  Guoyan Zheng,et al.  Crowd Counting with Deep Negative Correlation Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[3]  Qijun Chen,et al.  Revisiting Perspective Information for Efficient Crowd Counting , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Cees Snoek,et al.  Counting With Focus for Free , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[5]  Vishal M. Patel,et al.  CNN-Based cascaded multi-task learning of high-level prior and density estimation for crowd counting , 2017, 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[6]  Yuhong Li,et al.  CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[7]  Shenghua Gao,et al.  Single-Image Crowd Counting via Multi-Column Convolutional Neural Network , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Shiv Surya,et al.  Switching Convolutional Neural Network for Crowd Counting , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Liang Lin,et al.  Crowd Counting using Deep Recurrent Spatial-Aware Network , 2018, IJCAI.

[10]  Wei Wu,et al.  Hierarchical Feature Embedding for Attribute Recognition , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Hieu Le,et al.  Iterative Crowd Counting , 2018, ECCV.

[12]  Sridha Sridharan,et al.  Crowd Counting Using Multiple Local Features , 2009, 2009 Digital Image Computing: Techniques and Applications.

[13]  Vishal M. Patel,et al.  Generating High-Quality Crowd Density Maps Using Contextual Pyramid CNNs , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[14]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[15]  Ramprasaath R. Selvaraju,et al.  Counting Everyday Objects in Everyday Scenes , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Fei Su,et al.  Scale Aggregation Network for Accurate and Efficient Crowd Counting , 2018, ECCV.

[17]  Tieniu Tan,et al.  Estimating the number of people in crowded scenes by MID based foreground segmentation and head-shoulder detection , 2008, 2008 19th International Conference on Pattern Recognition.

[18]  Yihong Gong,et al.  Bayesian Loss for Crowd Count Estimation With Point Supervision , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[19]  Alexander Hauptmann,et al.  Learning Spatial Awareness to Improve Crowd Counting , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[20]  Meng Wang,et al.  DADNet: Dilated-Attention-Deformable ConvNet for Crowd Counting , 2019, ACM Multimedia.

[21]  Li Pan,et al.  ADCrowdNet: An Attention-Injective Deformable Convolutional Network for Crowd Understanding , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Pietro Perona,et al.  Multiple Component Learning for Object Detection , 2008, ECCV.

[23]  Andrew Zisserman,et al.  Learning To Count Objects in Images , 2010, NIPS.

[24]  Yuning Jiang,et al.  Repulsion Loss: Detecting Pedestrians in a Crowd , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[25]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[26]  Ramakant Nevatia,et al.  Bayesian human segmentation in crowded situations , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[27]  Robert T. Collins,et al.  Marked point processes for crowd counting , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Shaogang Gong,et al.  Cumulative Attribute Space for Age and Crowd Density Estimation , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[30]  Antoni B. Chan,et al.  Crowd Counting by Adaptively Fusing Predictions from an Image Pyramid , 2018, BMVC.

[31]  Yi Li,et al.  Deformable Convolutional Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[32]  Yang Wang,et al.  Crowd Counting Using Scale-Aware Attention Networks , 2019, 2019 IEEE Winter Conference on Applications of Computer Vision (WACV).

[33]  Chongyang Zhang,et al.  Leveraging Heterogeneous Auxiliary Tasks to Assist Crowd Counting , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Guanbin Li,et al.  Crowd Counting With Deep Structured Scale Integration Network , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[35]  Sheng-Fuu Lin,et al.  Estimation of number of people in crowded scenes using perspective transformation , 2001, IEEE Trans. Syst. Man Cybern. Part A.

[36]  Nuno Vasconcelos,et al.  Privacy preserving crowd monitoring: Counting people without people models or tracking , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[37]  Wangmeng Zuo,et al.  Perspective-Guided Convolution Networks for Crowd Counting , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[38]  Xiaogang Wang,et al.  Cross-scene crowd counting via deep convolutional neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Lior Wolf,et al.  Learning to Count with CNN Boosting , 2016, ECCV.

[40]  Hao Lu,et al.  From Open Set to Closed Set: Counting Objects by Spatial Divide-and-Conquer , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[41]  Roberto Cipolla,et al.  Unsupervised Bayesian Detection of Independent Motion in Crowds , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[42]  Antoni B. Chan,et al.  Adaptive Density Map Generation for Crowd Counting , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[43]  Haroon Idrees,et al.  Multi-source Multi-scale Counting in Extremely Dense Crowd Images , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[44]  Joost van de Weijer,et al.  Leveraging Unlabeled Data for Crowd Counting by Learning to Rank , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[45]  Daniel Oñoro-Rubio,et al.  Towards Perspective-Free Object Counting with Deep Learning , 2016, ECCV.

[46]  Haroon Idrees,et al.  Composition Loss for Counting, Density Map Estimation and Localization in Dense Crowds , 2018, ECCV.

[47]  Lu Zhang,et al.  Crowd Counting via Scale-Adaptive Convolutional Neural Network , 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[48]  Jonathan Ventura,et al.  An Aggregated Multicolumn Dilated Convolution Network for Perspective-Free Counting , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[49]  Bingbing Ni,et al.  Crowd Counting via Adversarial Cross-Scale Consistency Pursuit , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[50]  Changxin Gao,et al.  Scale Pyramid Network for Crowd Counting , 2019, 2019 IEEE Winter Conference on Applications of Computer Vision (WACV).