Semantic image segmentation using region-based object detector

Semantic image segmentation plays a central role in image understanding and could be applied in various applications. With the development of deep learning techniques, great progress has been made in recent years. However, it is still very challenging to produce accurate segmentation results especially near object boundaries. Furthermore, all the methods with top performances are contributed by pixel-level annotations which needs expensive human effort and tremendous labeling time. In this paper, we explore a simple semantic segmentation approach using region-based object detector which only needs bounding box annotations. The main idea is using object detector to classify region proposals and then applying saliency detection method to segment such classified proposals. Experimental results on PASCAL VOC 2012 validation dataset show its comparable performance with fully supervised methods by masks.

[1]  Bernt Schiele,et al.  What Makes for Effective Detection Proposals? , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  George Papandreou,et al.  Rethinking Atrous Convolution for Semantic Image Segmentation , 2017, ArXiv.

[3]  Ian D. Reid,et al.  RefineNet: Multi-path Refinement Networks for High-Resolution Semantic Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[6]  Jian Sun,et al.  ScribbleSup: Scribble-Supervised Convolutional Networks for Semantic Segmentation , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Frank Keller,et al.  Training Object Class Detectors from Eye Tracking Data , 2014, ECCV.

[8]  Yi Li,et al.  R-FCN: Object Detection via Region-based Fully Convolutional Networks , 2016, NIPS.

[9]  Fei-Fei Li,et al.  What's the Point: Semantic Segmentation with Point Supervision , 2015, ECCV.

[10]  Vibhav Vineet,et al.  Conditional Random Fields as Recurrent Neural Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[11]  Cewu Lu,et al.  Complexity-adaptive distance metric for object proposals generation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Jonathan T. Barron,et al.  Semantic Image Segmentation with Task-Specific Edge Detection Using CNNs and a Discriminatively Trained Domain Transform , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Xuelong Hu,et al.  Saliency detection for improving object proposals , 2016, 2016 IEEE International Conference on Digital Signal Processing (DSP).

[14]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  한보형,et al.  Learning Deconvolution Network for Semantic Segmentation , 2015 .

[16]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  Daniel P. Huttenlocher,et al.  Efficient Graph-Based Image Segmentation , 2004, International Journal of Computer Vision.

[18]  Jian Sun,et al.  BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[19]  Jian Sun,et al.  Instance-Aware Semantic Segmentation via Multi-task Network Cascades , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Jonathan T. Barron,et al.  Multiscale Combinatorial Grouping , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  Honglak Lee,et al.  Object Contour Detection with a Fully Convolutional Encoder-Decoder Network , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[23]  Bernt Schiele,et al.  Weakly Supervised Semantic Labelling and Instance Segmentation , 2016, ArXiv.

[24]  Ejaz Ahmed,et al.  Semantic Object Selection , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Philip H. S. Torr,et al.  BING: Binarized normed gradients for objectness estimation at 300fps , 2019, Computational Visual Media.

[26]  Jianbo Shi,et al.  Convolutional Random Walk Networks for Semantic Image Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).