Affinity LCFCN: Learning to Segment Fish with Weak Supervision

Aquaculture industries rely on accurate fish body measurements, e.g., length, width and mass. Manual methods that rely on physical tools like rulers are time- and labour-intensive. Leading automatic approaches rely on fully-supervised segmentation models to acquire these measurements, but these require collecting per-pixel labels, which is also time-consuming and laborious: it can take up to two minutes per fish to generate accurate segmentation labels, almost always requiring at least some manual intervention. We propose an automatic segmentation model efficiently trained on images labeled with only point-level supervision, where each fish is annotated with a single click. This labeling process requires significantly less manual intervention, averaging roughly one second per fish. Our approach uses a fully convolutional neural network with one branch that outputs per-pixel scores and another that outputs an affinity matrix. We aggregate these two outputs using a random walk to obtain the final, refined per-pixel segmentation output. We train the entire model end-to-end with an LCFCN loss, resulting in our A-LCFCN method. We validate our model on the DeepFish dataset, which contains many fish habitats from the north-eastern Australian region. Our experimental results confirm that A-LCFCN outperforms a fully-supervised segmentation model at a fixed annotation budget. Moreover, we show that A-LCFCN achieves better segmentation results than LCFCN and a standard baseline. We have released the code at \url{this https URL}.
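
To illustrate how per-pixel scores and an affinity matrix can be combined with a random walk, the following is a minimal sketch and not the released implementation. It assumes a flattened image of N pixels, a non-negative N x N affinity matrix, and two hypothetical hyperparameters (`beta` for sharpening the affinities, `n_steps` for the number of propagation steps); the function name `random_walk_refine` is likewise chosen only for illustration.

```python
def random_walk_refine(scores, affinity, beta=2.0, n_steps=3):
    """Propagate per-pixel scores along pairwise affinities (illustrative sketch).

    scores:   list of N floats, one score per pixel of the flattened image
    affinity: N x N list of non-negative floats (pairwise pixel similarity)
    beta:     assumed exponent that sharpens the affinities
    n_steps:  assumed number of random-walk steps
    """
    # Sharpen the affinities, then normalise each row to sum to 1 so the
    # matrix can be read as the transition matrix of a random walk.
    transition = []
    for row in affinity:
        powered = [a ** beta for a in row]
        total = sum(powered) or 1.0  # guard against an all-zero row
        transition.append([p / total for p in powered])

    # Each step replaces a pixel's score with the affinity-weighted
    # average of the scores of the pixels it is connected to.
    refined = list(scores)
    for _ in range(n_steps):
        refined = [sum(t * s for t, s in zip(row, refined)) for row in transition]
    return refined


# Toy example: four pixels, where pixels 0-1 and pixels 2-3 form two groups.
scores = [0.9, 0.1, 0.1, 0.1]
affinity = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
refined = random_walk_refine(scores, affinity)
print(refined)
```

In this toy case the high score at pixel 0 is shared with pixel 1, which has high affinity to it, so both end up at approximately 0.5, while pixels 2 and 3 stay at approximately 0.1. In the method described above, both the scores and the affinities would be predicted by the network's two branches rather than set by hand.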
