Boosting R-CNN: Reweighting R-CNN Samples by RPN's Error for Underwater Object Detection

Complicated underwater environments bring new challenges to object detection, such as unbalanced light conditions, low contrast, occlusion, and mimicry of aquatic organisms. Under these circumstances, objects captured by an underwater camera become vague, and generic detectors often fail on them. This work addresses the problem from two perspectives: uncertainty modeling and hard example mining. We propose a two-stage underwater detector named Boosting R-CNN, which comprises three key components. First, a new region proposal network named RetinaRPN provides high-quality proposals and models the object prior probability by using both objectness and IoU prediction to capture uncertainty. Second, a probabilistic inference pipeline combines the first-stage prior uncertainty with the second-stage classification score to produce the final detection score. Finally, we propose a new hard example mining method named boosting reweighting. Specifically, when the region proposal network miscalculates the object prior probability for a sample, boosting reweighting increases the classification loss of that sample in the R-CNN head during training, while reducing the loss of easy samples whose priors are accurately estimated. In this way, a robust second-stage detection head is obtained. At inference time, the R-CNN head can rectify the errors of the first stage and thereby improve performance. Comprehensive experiments on two underwater datasets and two generic object detection datasets demonstrate the effectiveness and robustness of our method.
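
The probabilistic inference pipeline and boosting reweighting described above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the prior is taken as the geometric mean of objectness and predicted IoU, the final score as the product of prior and classification score, and the sample weight as the absolute error of the prior, normalized to a mean of 1. All function names are hypothetical and do not come from the authors' implementation.

```python
import math


def object_prior(objectness, iou_pred):
    # Assumption: combine objectness and predicted IoU by geometric mean.
    return math.sqrt(objectness * iou_pred)


def detection_score(prior, cls_score):
    # Assumption: final score = first-stage prior * second-stage class score.
    return prior * cls_score


def boosting_weights(priors, is_object):
    # Weight each sample by how wrong the first-stage prior was:
    # target is 1.0 for object samples and 0.0 for background samples.
    errors = [abs(float(t) - p) for p, t in zip(priors, is_object)]
    total = sum(errors)
    if total == 0:
        # No first-stage error anywhere: fall back to uniform weights.
        return [1.0] * len(errors)
    n = len(errors)
    # Normalize so the weights average to 1 (hypothetical choice).
    return [e * n / total for e in errors]


def reweighted_cls_loss(true_class_probs, priors, is_object):
    # Cross-entropy on the probability assigned to the true class,
    # scaled per sample by the boosting weight and averaged.
    weights = boosting_weights(priors, is_object)
    losses = [-math.log(max(p, 1e-12)) for p in true_class_probs]
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Three RoIs: an easy object, a hard object the RPN underrated,
# and an easy background sample.
priors = [0.9, 0.2, 0.1]
is_object = [True, True, False]
weights = boosting_weights(priors, is_object)
loss = reweighted_cls_loss([0.5, 0.5, 0.5], priors, is_object)
```

Under these assumptions the hard sample (prior 0.2 for a true object) receives a weight above 1, while the two samples with accurate priors receive weights below 1, which is the qualitative behavior the abstract attributes to boosting reweighting.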
