VisDrone-DET2021: The Vision Meets Drone Object Detection Challenge Results

Object detection on drones faces a wide range of challenges, such as small-object inference, background clutter, and wide viewpoint variation. Unlike traditional detection problems in computer vision, object detection from a bird's-eye view cannot directly reuse commonly used methods, because objects show distinctive textures when viewed from the sky. However, due to the lack of a comprehensive dataset, the number of algorithms that focus on object detection in drone-captured data is limited. The VisDrone team therefore gathered a large dataset and organized Vision Meets Drones: A Challenge (VisDrone2021) in conjunction with the IEEE International Conference on Computer Vision (ICCV 2021) to advance the field. The dataset is the same as the one used in the previous object detection challenge. Specifically, participating teams were required to predict the bounding boxes of objects from ten predefined classes. We received results from a number of teams using different approaches, and this article describes the approaches of 8 teams. We conducted a detailed analysis of the evaluation results and summarized the challenges. More information can be found at: http://www.aiskyeye.com/.
