Underwater Fish Detection with Weak Multi-Domain Supervision

Given a sufficiently large training dataset, it is relatively easy to train a modern convolutional neural network (CNN) as an image classifier. However, a CNN trained to detect or classify particular fish species in particular background habitats exhibits much lower accuracy when applied to new or unseen fish species and/or habitats. In practice, therefore, the CNN must be continually fine-tuned to handle new project-specific fish species or habitats. In this work we present a labelling-efficient method of training a CNN-based fish detector (with the Xception CNN as the base) on a relatively small number (4,000) of project-domain underwater fish/no-fish images from 20 different habitats. Additionally, 17,000 known-negative (that is, containing no fish) general-domain (VOC2012) above-water images were used. Two publicly available fish-domain datasets supplied an additional 27,000 above-water and underwater positive (fish) images. Trained on this multi-domain collection of images, the Xception-based binary (fish/no-fish) classifier achieved 0.17% false positives and 0.61% false negatives on the project's 20,000 negative and 16,000 positive holdout test images, respectively. The area under the ROC curve (AUC) was 99.94%.
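To make the reported metrics concrete, the following minimal sketch shows one way false-positive rate, false-negative rate, and ROC AUC can be computed from the scores of a binary fish/no-fish classifier. It is an illustration only, not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions.

```python
from typing import Sequence, Tuple


def error_rates(labels: Sequence[int], scores: Sequence[float],
                threshold: float = 0.5) -> Tuple[float, float]:
    """Return (false-positive rate, false-negative rate) at a threshold.

    labels: 1 = fish, 0 = no fish. The 0.5 default is an assumption.
    """
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    positives = [s for y, s in zip(labels, scores) if y == 1]
    false_pos = sum(s >= threshold for s in negatives)  # no-fish called fish
    false_neg = sum(s < threshold for s in positives)   # fish called no-fish
    return false_pos / len(negatives), false_neg / len(positives)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC via the rank-sum (Mann-Whitney U) identity.

    Equals the probability that a random positive image scores higher
    than a random negative image, with ties counted as one half.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores and give each the average rank.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for y, r in zip(labels, ranks) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy data (hypothetical): four no-fish images followed by four fish images.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_scores = [0.1, 0.2, 0.6, 0.3, 0.9, 0.8, 0.4, 0.7]

fp_rate, fn_rate = error_rates(toy_labels, toy_scores)
auc = roc_auc(toy_labels, toy_scores)
print(f"FP rate = {fp_rate:.2%}, FN rate = {fn_rate:.2%}, AUC = {auc:.4f}")
```

Applied to the figures above, a 0.17% false-positive rate on 20,000 negative test images corresponds to roughly 34 misclassified images, and a 0.61% false-negative rate on 16,000 positive test images to roughly 98.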
