Multi-scale ResNet for real-time underwater object detection

An automatic underwater object recognition system is essential for reducing the cost of underwater inspection. In this study, we propose a novel convolutional neural network architecture, trained on underwater video frames, that is based on a modified residual neural network (ResNet). The modified network, Multi-scale ResNet (M-ResNet), improves efficiency by using multi-scale operations to detect objects of various sizes accurately, especially small objects. Experimental results show that the proposed method achieves a mean average precision (mAP) of 96.5%. Building on this result, we present a system for automatic object detection in marine environments.
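To make the reported metric concrete, the following is a minimal sketch of how mAP is commonly computed: average precision (AP) is the area under the precision-recall curve for one class, and mAP is the mean of the per-class APs. The abstract does not state which evaluation protocol was used, so the all-point interpolation, the function names, and the class labels below are assumptions for illustration only. Matching detections to ground truth (for example by an IoU threshold) is assumed to have been done already.

```python
from typing import Dict, List, Tuple

# One detection = (confidence score, whether it matched a ground-truth object).
Detection = Tuple[float, bool]


def average_precision(detections: List[Detection], num_ground_truth: int) -> float:
    """Area under the precision-recall curve, using all-point interpolation."""
    if num_ground_truth == 0:
        return 0.0

    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions: List[float] = []
    recalls: List[float] = []
    true_pos = 0
    false_pos = 0
    for _, is_true_positive in ranked:
        if is_true_positive:
            true_pos += 1
        else:
            false_pos += 1
        precisions.append(true_pos / (true_pos + false_pos))
        recalls.append(true_pos / num_ground_truth)

    # Interpolate: precision at each rank becomes the best precision
    # achieved at that rank or any later (higher-recall) rank.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum rectangle areas over each increase in recall.
    ap = 0.0
    previous_recall = 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(per_class: Dict[str, Tuple[List[Detection], int]]) -> float:
    """Mean of per-class AP values; per_class maps label -> (detections, #ground truth)."""
    if not per_class:
        return 0.0
    aps = [average_precision(dets, n_gt) for dets, n_gt in per_class.values()]
    return sum(aps) / len(aps)


if __name__ == "__main__":
    # Hypothetical toy data, not results from the paper.
    toy = {
        "fish": ([(0.9, True), (0.8, False), (0.7, True)], 2),
        "crab": ([(0.95, True)], 1),
    }
    print(f"mAP = {mean_average_precision(toy):.4f}")
```

In this toy input, the "fish" class has one false positive ranked between two true positives, which lowers its AP below 1, while the "crab" class is detected perfectly. The final score is the unweighted mean of the two.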
