VisDrone-CC2021: The Vision Meets Drone Crowd Counting Challenge Results

Crowd counting research is evolving quickly, driven by advances in deep learning. Many researchers have devoted effort to crowd counting tasks and achieved significant improvements. However, current datasets barely keep pace with this progress, and high-quality evaluation data is urgently needed. Motivated by the need for high-quality, large-scale data for crowd counting research, we collect a drone-captured dataset of 5,468 images, consisting of 2,734 pairs of RGB and thermal images. There are 1,807 pairs of images for training and 927 pairs for testing. We manually annotate persons with points in each frame. Based on this dataset, we organized the Vision Meets Drone Crowd Counting Challenge (VisDrone-CC2021) in conjunction with the International Conference on Computer Vision (ICCV 2021). The challenge attracted many researchers, which helps pave the way for faster progress in crowd counting. To summarize the competition, we select the most remarkable algorithms from participants' submissions and provide a detailed analysis of the evaluation results. More information can be found at the website: http://www.aiskyeye.com/.
