A Pixel-Level Meta-Learner for Weakly Supervised Few-Shot Semantic Segmentation

Few-shot semantic segmentation addresses the learning task in which only few images with ground truth pixel-level labels are available for the novel classes of interest. One is typically required to collect a large mount of data (i.e., base classes) with such ground truth information, followed by meta-learning strategies to address the above learning task. When only image-level semantic labels can be observed during both training and testing, it is considered as an even more challenging task of weakly supervised few-shot semantic segmentation. To address this problem, we propose a novel meta-learning framework, which predicts pseudo pixel-level segmentation masks from a limited amount of data and their semantic labels. More importantly, our learning scheme further exploits the produced pixel-level information for query image inputs with segmentation guarantees. Thus, our proposed learning model can be viewed as a pixel-level meta-learner. Through extensive experiments on benchmark datasets, we show that our model achieves satisfactory performances under fully supervised settings, yet performs favorably against state-ofthe-art methods under weakly supervised settings.

[1]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[2]  Yizhou Yu,et al.  FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation , 2019, ArXiv.

[3]  Jia Xu,et al.  Tell Me What You See and I Will Show You Where It Is , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Nanning Zheng,et al.  Learning to Detect a Salient Object , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Xiantong Zhen,et al.  Few-Shot Semantic Segmentation with Democratic Attention Networks , 2020, ECCV.

[6]  Mahdyar Ravanbakhsh,et al.  Weakly Supervised One Shot Segmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).

[7]  Wenyu Liu,et al.  Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[8]  Xiao Liu,et al.  Probabilistic Graphlet Cut: Exploiting Spatial Structure Cue for Weakly Supervised Image Segmentation , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Joachim M. Buhmann,et al.  Towards weakly supervised semantic segmentation by means of multiple instance and multitask learning , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[10]  Yoshua Bengio,et al.  A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..

[11]  Xuming He,et al.  Part-aware Prototype Network for Few-shot Semantic Segmentation , 2020, ECCV.

[12]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[13]  Xiaoming Wei,et al.  Embedded Discriminative Attention Mechanism for Weakly Supervised Semantic Segmentation , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Subhransu Maji,et al.  Semantic contours from inverse detectors , 2011, 2011 International Conference on Computer Vision.

[15]  B. S. Manjunath,et al.  Weakly Supervised Graph Based Semantic Segmentation by Learning Communities of Image-Parts , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[16]  Khoi Nguyen,et al.  Feature Weighting and Boosting for Few-Shot Segmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[17]  George Papandreou,et al.  Rethinking Atrous Convolution for Semantic Image Segmentation , 2017, ArXiv.

[18]  Rui Yao,et al.  CANet: Class-Agnostic Segmentation Networks With Iterative Refinement and Attentive Few-Shot Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Byron Boots,et al.  One-Shot Learning for Semantic Segmentation , 2017, BMVC.

[20]  Bolei Zhou,et al.  Learning Deep Features for Discriminative Localization , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[23]  Daan Wierstra,et al.  Meta-Learning with Memory-Augmented Neural Networks , 2016, ICML.

[24]  Iasonas Kokkinos,et al.  Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs , 2014, ICLR.

[25]  Alexei A. Efros,et al.  Conditional Networks for Few-Shot Semantic Segmentation , 2018, ICLR.

[26]  Varun Jampani,et al.  Adaptive Prototype Learning and Allocation for Few-Shot Segmentation , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Hengshuang Zhao,et al.  Prior Guided Feature Enrichment Network for Few-Shot Segmentation , 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Qixiang Ye,et al.  Prototype Mixture Models for Few-shot Semantic Segmentation , 2020, ECCV.

[29]  Abhishek Das,et al.  Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[30]  Xiaogang Wang,et al.  Pyramid Scene Parsing Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Vladlen Koltun,et al.  Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.

[32]  Jia Xu,et al.  Learning to segment under various forms of weak supervision , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[34]  Tao Xiang,et al.  Learning to Compare: Relation Network for Few-Shot Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[35]  Pietro Perona,et al.  One-shot learning of object categories , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  Michael Fink,et al.  Object Classification from a Single Example Utilizing Class Relevance Metrics , 2004, NIPS.

[37]  Jiashi Feng,et al.  PANet: Few-Shot Image Semantic Segmentation With Prototype Alignment , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[38]  Gang Yu,et al.  Attention-Based Multi-Context Guiding for Few-Shot Semantic Segmentation , 2019, AAAI.

[39]  Tomas Mikolov,et al.  Enriching Word Vectors with Subword Information , 2016, TACL.

[40]  Zhuowen Tu,et al.  Deeply Supervised Salient Object Detection with Short Connections , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Hugo Larochelle,et al.  Optimization as a Model for Few-Shot Learning , 2016, ICLR.

[42]  Martin Jägersand,et al.  AMP: Adaptive Masked Proxies for Few-Shot Segmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[43]  Sheng Zeng,et al.  Weakly supervised semantic segmentation for social images , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[44]  Boris N. Oreshkin,et al.  Weakly Supervised Few-shot Object Segmentation using Co-Attention with Visual and Semantic Inputs , 2020, ArXiv.

[45]  Joachim M. Buhmann,et al.  Weakly supervised structured output learning for semantic segmentation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[46]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[47]  Matthieu Cord,et al.  Zero-Shot Semantic Segmentation , 2019, NeurIPS.

[48]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[49]  Siddhartha Gairola,et al.  SimPropNet: Improved Similarity Propagation for Few-shot Image Segmentation , 2020, IJCAI.

[50]  Eric P. Xing,et al.  Few-Shot Semantic Segmentation with Prototype Learning , 2018, BMVC.

[51]  Guosheng Lin,et al.  CRNet: Cross-Reference Networks for Few-Shot Segmentation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[52]  Roberto Cipolla,et al.  SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[53]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[54]  Sergey Levine,et al.  Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.

[55]  George Papandreou,et al.  Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation , 2018, ECCV.

[56]  Yi Yang,et al.  SG-One: Similarity Guidance Network for One-Shot Semantic Segmentation , 2018, IEEE Transactions on Cybernetics.

[57]  Bartunov Sergey,et al.  Meta-Learning with Memory-Augmented Neural Networks , 2016 .

[58]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.