A deep neural network for multi-species fish detection using multiple acoustic cameras

Underwater acoustic cameras are high-potential devices for many applications in ecology, notably fisheries management and monitoring. However, turning the data they record into high-value information without an operator having to read through the entire dataset, which is time-consuming, remains a challenge. Moreover, because of its low signal-to-noise ratio, acoustic imaging is an ideal testing ground for new approaches, especially Deep Learning techniques. We present a novel approach that combines a CNN (Convolutional Neural Network) with classical CV (Computer Vision) techniques to detect a generic class “fish” in acoustic video streams. The pipeline pre-processes the acoustic images to extract two features, in order to localise the signals and improve detection performance. To ensure that this performance is meaningful from an ecological point of view, we also propose a two-step validation: one step validates the training results and the other tests the method in a real-world scenario. The YOLOv3-based model was trained on data of fish from multiple species recorded by the two common acoustic cameras, DIDSON and ARIS, including species of high ecological interest such as Atlantic salmon and European eel. The model provides satisfactory results, detecting almost 80% of fish while minimizing the false positive rate; however, it is much less efficient at detecting eels in ARIS videos. This first CNN pipeline for fish monitoring that exploits video data from two models of acoustic cameras satisfies most of the required features. Many challenges remain, such as automating fish species identification through a multiclass model. However, the results point to a new solution for dealing with complex data, such as sonar data, which can also be reapplied in other cases where the signal-to-noise ratio is a challenge.
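To make the reported figures concrete, the sketch below shows one common way a detection rate and a false positive count can be computed for a single video frame by matching predicted boxes to annotated fish. It is only an illustration: the abstract does not describe the evaluation protocol, so the IoU (intersection over union) matching criterion, the 0.5 threshold, and the names `iou`, `score_frame` and `detection_rate` are all assumptions introduced here, not the authors' method.

```python
from typing import List, Tuple

# Axis-aligned bounding box: (x_min, y_min, x_max, y_max), in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes (0.0 when they do not overlap)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def score_frame(truth: List[Box], detections: List[Box],
                threshold: float = 0.5) -> Tuple[int, int, int]:
    """Greedy one-to-one matching of detections to annotated fish.

    Returns (true_positives, false_positives, false_negatives).
    The 0.5 threshold is an assumed, conventional value.
    """
    unmatched = list(truth)
    true_pos = 0
    false_pos = 0
    for det in detections:
        # Pick the still-unmatched annotation that overlaps this detection most.
        best = max(unmatched, key=lambda t: iou(det, t), default=None)
        if best is not None and iou(det, best) >= threshold:
            unmatched.remove(best)
            true_pos += 1
        else:
            false_pos += 1
    return true_pos, false_pos, len(unmatched)


def detection_rate(true_pos: int, false_neg: int) -> float:
    """Share of annotated fish that were detected (recall)."""
    total = true_pos + false_neg
    return true_pos / total if total else 0.0


# Hypothetical frame: two annotated fish, one good detection, one spurious one.
truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(1, 1, 10, 10), (50, 50, 60, 60)]
tp, fp, fn = score_frame(truth, detections)  # (1, 1, 1)
rate = detection_rate(tp, fn)                # 0.5
```

Summing the three counts over all frames of a validation video and then calling `detection_rate` would give a dataset-level figure comparable in kind to the "almost 80% of fish" reported above, under the assumptions stated.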
