Alpha-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression

Bounding box (bbox) regression is a fundamental task in computer vision. So far, the most commonly used loss functions for bbox regression are the Intersection over Union (IoU) loss and its variants. In this paper, we generalize existing IoU-based losses to a new family of power IoU losses that have a power IoU term and an additional power regularization term, both controlled by a single power parameter α. We call this new family of losses the α-IoU losses and analyze their properties, such as order preservingness and loss/gradient reweighting. Experiments on multiple object detection benchmarks and models demonstrate that α-IoU losses 1) can surpass existing IoU-based losses by a noticeable performance margin; 2) offer detectors more flexibility in achieving different levels of bbox regression accuracy by modulating α; and 3) are more robust to small datasets and noisy bboxes.
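As a minimal sketch of the idea, the power IoU term can be written as 1 - IoU^α, with an optional regularization term raised to the same power α. The code below assumes axis-aligned boxes in (x1, y1, x2, y2) format; the function names, the default α, and the generic non-negative `penalty` argument are illustrative assumptions, not the paper's reference implementation.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(pred, target, alpha=2.0, penalty=0.0):
    """Power IoU loss sketch: 1 - IoU**alpha + penalty**alpha.

    `penalty` is a hypothetical stand-in for a non-negative regularization
    term (for example a normalized center distance); 0.0 disables it.
    The default alpha is an arbitrary illustrative value.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if penalty < 0:
        raise ValueError("penalty must be non-negative")
    return 1.0 - iou(pred, target) ** alpha + penalty ** alpha


if __name__ == "__main__":
    gt = (0.0, 0.0, 2.0, 2.0)
    good = (0.0, 0.0, 2.0, 1.8)   # high overlap with gt
    poor = (1.0, 0.0, 3.0, 2.0)   # low overlap with gt
    for a in (1.0, 2.0, 3.0):
        print(a, alpha_iou_loss(good, gt, alpha=a), alpha_iou_loss(poor, gt, alpha=a))
```

With α = 1 and no penalty this reduces to the plain IoU loss. For any α > 0 a box with higher IoU still receives a lower loss, which is the order-preserving property mentioned above; changing α only changes how strongly high-IoU and low-IoU boxes are weighted relative to each other.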
