Weakly Supervised Semantic Segmentation Using Superpixel Pooling Network

We propose a weakly supervised semantic segmentation algorithm based on deep neural networks, which relies on imagelevel class labels only. The proposed algorithm alternates between generating segmentation annotations and learning a semantic segmentation network using the generated annotations. A key determinant of success in this framework is the capability to construct reliable initial annotations given image-level labels only. To this end, we propose Superpixel Pooling Network (SPN), which utilizes superpixel segmentation of input image as a pooling layout to reflect low-level image structure for learning and inferring semantic segmentation. The initial annotations generated by SPN are then used to learn another neural network that estimates pixelwise semantic labels. The architecture of the segmentation network decouples semantic segmentation task into classification and segmentation so that the network learns classagnostic shape prior from the noisy annotations. It turns out that both networks are critical to improve semantic segmentation accuracy. The proposed algorithm achieves outstanding performance in weakly supervised semantic segmentation task compared to existing techniques on the challenging PASCAL VOC 2012 segmentation benchmark.

[1]  Rob Fergus,et al.  Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[2]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[3]  Jian Sun,et al.  ScribbleSup: Scribble-Supervised Convolutional Networks for Semantic Segmentation , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Trevor Darrell,et al.  Fully Convolutional Multi-Class Multiple Instance Learning , 2014, ICLR.

[5]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[6]  Guosheng Lin,et al.  Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Geoffrey E. Hinton,et al.  Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.

[8]  Bolei Zhou,et al.  Learning Deep Features for Discriminative Localization , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  George Papandreou,et al.  Weakly-and Semi-Supervised Learning of a Deep Convolutional Network for Semantic Image Segmentation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[10]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[11]  Seunghoon Hong,et al.  Decoupled Deep Neural Network for Semi-supervised Semantic Segmentation , 2015, NIPS.

[12]  한보형,et al.  Learning Deconvolution Network for Semantic Segmentation , 2015 .

[13]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[14]  Clément Farabet,et al.  Torch7: A Matlab-like Environment for Machine Learning , 2011, NIPS 2011.

[15]  한보형 Learning Transferrable Knowledge for Semantic Segmentation with Deep Convolutional Neural Network , 2016 .

[16]  Guo-Jun Qi,et al.  Hierarchically Gated Deep Networks for Semantic Segmentation , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[18]  Trevor Darrell,et al.  Constrained Convolutional Neural Networks for Weakly Supervised Segmentation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[19]  C. Lawrence Zitnick,et al.  Edge Boxes: Locating Object Proposals from Edges , 2014, ECCV.

[20]  Rama Chellappa,et al.  Gaussian Conditional Random Field Network for Semantic Segmentation , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Jian Sun,et al.  BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[22]  Ivan Laptev,et al.  Is object localization for free? - Weakly-supervised learning with convolutional neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Subhransu Maji,et al.  Semantic contours from inverse detectors , 2011, 2011 International Conference on Computer Vision.

[24]  Iasonas Kokkinos,et al.  Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs , 2014, ICLR.

[25]  Ronan Collobert,et al.  From image-level to pixel-level labeling with Convolutional Networks , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Vibhav Vineet,et al.  Conditional Random Fields as Recurrent Neural Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).