CentroidNetV2: A hybrid deep neural network for small-object segmentation and counting

Abstract: This paper presents CentroidNetV2, a novel hybrid Convolutional Neural Network (CNN) designed specifically to segment and count many small, connected object instances. This complete redesign of the original CentroidNet uses a CNN backbone to regress a field of centroid-voting vectors and border-voting vectors. The segmentation masks of the individual object instances are produced by decoding the centroid votes and border votes. A loss function that combines cross-entropy loss and Euclidean-distance loss yields high-quality centroids and borders of object instances. Several backbones and loss functions are tested on three datasets, ranging from precision agriculture to microbiology and pathology. CentroidNetV2 is compared to the state-of-the-art networks You Only Look Once Version 3 (YOLOv3) and Mask Region-based Convolutional Neural Network (MRCNN). CentroidNetV2 achieves the highest F1 score on two of the three datasets and the highest recall on all three. CentroidNetV2 demonstrates the best ability to detect small objects, although MRCNN produces the best segmentation masks for larger objects.
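The idea of decoding centroid votes can be illustrated with a minimal sketch. It assumes each pixel carries a (dy, dx) offset pointing at the centroid of its object, and that centroids are the cells collecting enough votes. The function names, the `min_votes` threshold, and the toy vote field are hypothetical; this is not the authors' implementation, and border-vote decoding is omitted.

```python
def accumulate_centroid_votes(votes):
    """Sum votes per cell; votes[y][x] = (dy, dx) offset to the predicted centroid."""
    height, width = len(votes), len(votes[0])
    accumulator = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dy, dx = votes[y][x]
            ty, tx = y + round(dy), x + round(dx)
            # Votes that land outside the image are ignored.
            if 0 <= ty < height and 0 <= tx < width:
                accumulator[ty][tx] += 1
    return accumulator


def find_centroids(accumulator, min_votes):
    """Return (y, x) cells whose vote count reaches min_votes (assumed threshold)."""
    return [
        (y, x)
        for y, row in enumerate(accumulator)
        for x, count in enumerate(row)
        if count >= min_votes
    ]


# Toy 4x4 field: the left two columns vote for (1, 0), the right two for (2, 3).
targets = [[(1, 0) if x < 2 else (2, 3) for x in range(4)] for _ in range(4)]
votes = [
    [(ty - y, tx - x) for x, (ty, tx) in enumerate(row)]
    for y, row in enumerate(targets)
]
acc = accumulate_centroid_votes(votes)
centroids = find_centroids(acc, min_votes=5)
```

In this toy field each object contributes eight votes to a single cell, so two centroids are recovered and the object count is the length of `centroids`. With regressed vectors, votes would scatter around the true centroid, so a real decoder would likely need smoothing or non-maximum suppression of the accumulator before thresholding.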
