Dual-NMS: A Method for Autonomously Removing False Detection Boxes from Aerial Image Object Detection Results

In the field of aerial image object detection based on deep learning, it’s difficult to extract features because the images are obtained from a top-down perspective. Therefore, there are numerous false detection boxes. The existing post-processing methods mainly remove overlapped detection boxes, but it’s hard to eliminate false detection boxes. The proposed dual non-maximum suppression (dual-NMS) combines the density of detection boxes that are generated for each detected object with the corresponding classification confidence to autonomously remove the false detection boxes. With the dual-NMS as a post-processing method, the precision is greatly improved under the premise of keeping recall unchanged. In vehicle detection in aerial imagery (VEDAI) and dataset for object detection in aerial images (DOTA) datasets, the removal rate of false detection boxes is over 50%. Additionally, according to the characteristics of aerial images, the correlation calculation layer for feature channel separation and the dilated convolution guidance structure are proposed to enhance the feature extraction ability of the network, and these structures constitute the correlation network (CorrNet). Compared with you only look once (YOLOv3), the mean average precision (mAP) of the CorrNet for DOTA increased by 9.78%. Commingled with dual-NMS, the detection effect in aerial images is significantly improved.

[1]  In-So Kweon,et al.  CBAM: Convolutional Block Attention Module , 2018, ECCV.

[2]  Xiao Xiang Zhu,et al.  A Relation-Augmented Fully Convolutional Network for Semantic Segmentation in Aerial Scenes , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Vladlen Koltun,et al.  Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.

[4]  Yuting Gao,et al.  Fused Text Segmentation Networks for Multi-oriented Scene Text Detection , 2017, 2018 24th International Conference on Pattern Recognition (ICPR).

[5]  Jun Fu,et al.  Dual Attention Network for Scene Segmentation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Xiangyu Zhang,et al.  Softer-NMS: Rethinking Bounding Box Regression for Accurate Object Detection , 2018, ArXiv.

[7]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Enhua Wu,et al.  Squeeze-and-Excitation Networks , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Gang Wang,et al.  Progressive Attention Guided Recurrent Network for Salient Object Detection , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[10]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[11]  Ross B. Girshick,et al.  Mask R-CNN , 2017, 1703.06870.

[12]  Yun Teng,et al.  CornerNet-Lite: Efficient Keypoint based Object Detection , 2019, BMVC.

[13]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Bernt Schiele,et al.  Learning Non-maximum Suppression , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Qi Tian,et al.  CenterNet: Keypoint Triplets for Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[16]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Paolo Napoletano,et al.  Anomaly Detection in Nanofibrous Materials by CNN-Based Self-Similarity , 2018, Sensors.

[18]  Garrison W. Cottrell,et al.  Understanding Convolution for Semantic Segmentation , 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[19]  Bernt Schiele,et al.  A Convnet for Non-maximum Suppression , 2015, GCPR.

[20]  Jianwei Li,et al.  A Lightweight Convolutional Neural Network Based on Visual Attention for SAR Image Target Classification , 2018, Sensors.

[21]  Jiebo Luo,et al.  DOTA: A Large-Scale Dataset for Object Detection in Aerial Images , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[22]  Qing Liu,et al.  Accurate Object Localization in Remote Sensing Images Based on Convolutional Neural Networks , 2017, IEEE Transactions on Geoscience and Remote Sensing.

[23]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Anelia Angelova,et al.  Depth From Videos in the Wild: Unsupervised Monocular Depth Learning From Unknown Cameras , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[25]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Lianwen Jin,et al.  Detecting Curve Text in the Wild: New Dataset and New Solution , 2017, ArXiv.

[27]  Yuning Jiang,et al.  Acquisition of Localization Confidence for Accurate Object Detection , 2018, ECCV.

[28]  Yinghai Ke,et al.  Urban Land Use and Land Cover Classification Using Novel Deep Learning Models Based on High Spatial Resolution Satellite Imagery , 2018, Sensors.

[29]  Sergey Ioffe,et al.  Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning , 2016, AAAI.

[30]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[31]  Wei Li,et al.  R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection , 2017, ArXiv.

[32]  Xiangyu Zhang,et al.  Bounding Box Regression With Uncertainty for Accurate Object Detection , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  In So Kweon,et al.  Convolutional Block Attention Module , 2018, ECCV 2018.

[34]  Lin Lei,et al.  Vehicle Detection in Aerial Images Based on Region Convolutional Neural Networks and Hard Negative Example Mining , 2017, Sensors.

[35]  Silvio Savarese,et al.  Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37]  Stephen Lin,et al.  RepPoints: Point Set Representation for Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[38]  Ali Farhadi,et al.  YOLOv3: An Incremental Improvement , 2018, ArXiv.

[39]  Larry S. Davis,et al.  Soft-NMS — Improving Object Detection with One Line of Code , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[40]  Р Ю Чуйков,et al.  Обнаружение транспортных средств на изображениях загородных шоссе на основе метода Single shot multibox Detector , 2017 .