Confidence Score: The Forgotten Dimension of Object Detection Performance Evaluation

When deploying an object detection model, a confidence score threshold is chosen to filter out false positives, so that every retained bounding box has at least a minimum score. To achieve state-of-the-art performance on benchmark datasets, most neural networks use a rather low threshold, because standard evaluation metrics do not penalize a high number of false positives. However, in Artificial Intelligence (AI) applications that require high confidence scores (e.g., because of legal requirements or because the consequences of incorrect detections are severe) or a certain level of model robustness, it is unclear which base model to use, since existing models were mainly optimized for benchmark scores. In this paper, we propose a method to find the optimum performance point of a model as a basis for fairer comparison and deeper insights into the trade-offs caused by selecting a confidence score threshold.
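The trade-off described above can be illustrated with a minimal sketch that sweeps the confidence score threshold and reports precision, recall, and F1 at each setting. This is not the method proposed in the paper: the choice of F1 as the selection criterion, the function names `sweep_thresholds` and `best_threshold`, and the toy inputs are all assumptions made for illustration. It also assumes each detection has already been labeled as a true or false positive by a score-ordered matching against the ground truth.

```python
from typing import List, Sequence, Tuple

# (threshold, precision, recall, f1)
Row = Tuple[float, float, float, float]


def sweep_thresholds(
    scores: Sequence[float],
    is_tp: Sequence[bool],
    num_gt: int,
    thresholds: Sequence[float],
) -> List[Row]:
    """Compute precision, recall, and F1 for each confidence threshold.

    scores: confidence score of each detection (hypothetical input).
    is_tp:  whether each detection was matched to a ground-truth box.
    num_gt: total number of ground-truth boxes.
    """
    rows: List[Row] = []
    for t in thresholds:
        # Keep only detections whose score reaches the threshold.
        kept = [tp for s, tp in zip(scores, is_tp) if s >= t]
        tp = sum(kept)
        fp = len(kept) - tp
        precision = tp / (tp + fp) if kept else 0.0
        recall = tp / num_gt if num_gt else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append((t, precision, recall, f1))
    return rows


def best_threshold(rows: Sequence[Row]) -> Row:
    """Return the row with the highest F1 (an assumed criterion)."""
    return max(rows, key=lambda r: r[3])


# Toy data: five detections, four ground-truth objects.
scores = [0.9, 0.8, 0.6, 0.3, 0.2]
is_tp = [True, True, True, False, False]
rows = sweep_thresholds(scores, is_tp, num_gt=4, thresholds=[0.1, 0.5, 0.7])
best = best_threshold(rows)
```

In this toy case, a very low threshold keeps both false positives, while a high one discards a correct detection; the intermediate setting scores best under the assumed F1 criterion. A different criterion would generally select a different operating point.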
