Composited FishNet: Fish Detection and Species Recognition From Low-Quality Underwater Videos

Automatic detection and identification of fish in underwater videos is important for fishery resource assessment and ecological environment monitoring. However, because underwater images are of poor quality and fish movement is unconstrained, neither traditional hand-designed feature extraction methods nor convolutional neural network (CNN)-based object detection algorithms can meet the detection requirements of real underwater scenes. To recognize and localize fish in complex underwater environments, this paper proposes Composited FishNet, a novel fish detection framework built on a composite backbone and an enhanced path aggregation network. A new composite backbone network (CBresnet), obtained by improving the residual network (ResNet), is designed to learn the scene change information (source domain style) caused by differences in image brightness, fish orientation, seabed structure, aquatic plant movement, and the shape and texture of fish species. This reduces the interference of underwater environmental information with the object features and strengthens the object information output by the main network. In addition, to better integrate the high-level and low-level feature information output by CBresnet, an enhanced path aggregation network (EPANet) is designed to address the insufficient use of semantic information caused by linear upsampling. The experimental results show that the proposed Composited FishNet achieves 75.2% on average precision AP0.5:0.95, 92.8% on AP50, and 81.1% on average recall ARmax=10. The composite backbone network enhances the feature information output for the detected objects and improves the use of that feature information. The method can be used for fish detection and identification in complex underwater environments such as oceans and aquaculture.
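The reported metrics can be read with a small sketch. It assumes the subscripts follow the common COCO-style convention, in which AP0.5:0.95 is the mean of the AP values computed at ten intersection-over-union (IoU) thresholds from 0.50 to 0.95 in steps of 0.05, and AP50 is the AP at the single threshold 0.50. The function names below are hypothetical and are not taken from the paper, and the sketch does not reproduce the authors' evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def coco_iou_thresholds():
    """Ten IoU thresholds 0.50, 0.55, ..., 0.95 (assumed COCO convention)."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def mean_ap_over_thresholds(ap_per_threshold):
    """Average per-threshold AP values into a single AP0.5:0.95 score.

    ap_per_threshold maps each IoU threshold to the AP measured at it.
    """
    thresholds = coco_iou_thresholds()
    return sum(ap_per_threshold[t] for t in thresholds) / len(thresholds)
```

Under this reading, a detection is counted as correct at a given threshold only if its IoU with a ground-truth box reaches that threshold, so AP0.5:0.95 rewards tight localization more than AP50 does. That is consistent with the gap between the two reported values (75.2% versus 92.8%).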
