Information-Theoretic Segmentation by Inpainting Error Maximization

We study image segmentation from an information-theoretic perspective, proposing a novel adversarial method that performs unsupervised segmentation by partitioning images into maximally independent sets. More specifically, we group image pixels into foreground and background, with the goal of minimizing predictability of one set from the other. An easily computed loss drives a greedy search process to maximize inpainting error over these partitions. Our method does not involve training deep networks, is computationally cheap, class-agnostic, and even applicable in isolation to a single unlabeled image. Experiments demonstrate that it achieves a new state-of-the-art in unsupervised segmentation quality, while being substantially faster and more general than competing approaches.1

[1]  Trevor Darrell,et al.  Adversarial Feature Learning , 2016, ICLR.

[2]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Nikos Komodakis,et al.  Unsupervised Representation Learning by Predicting Image Rotations , 2018, ICLR.

[4]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[5]  Oriol Vinyals,et al.  Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.

[6]  Shi-Min Hu,et al.  Global Contrast Based Salient Region Detection , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  George Papandreou,et al.  Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation , 2018, ECCV.

[8]  Kaiming He,et al.  Momentum Contrast for Unsupervised Visual Representation Learning , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Gregory Shakhnarovich,et al.  Colorization as a Proxy Task for Visual Understanding , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Roberto Cipolla,et al.  SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[12]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Jeff Donahue,et al.  Large Scale Adversarial Representation Learning , 2019, NeurIPS.

[14]  Harshad Rai,et al.  Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks , 2018 .

[15]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[16]  Xu Ji,et al.  Invariant Information Clustering for Unsupervised Image Classification and Segmentation , 2019 .

[17]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[18]  Alexei A. Efros,et al.  Unsupervised Visual Representation Learning by Context Prediction , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[19]  Ross B. Girshick,et al.  Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Andrea Lancichinetti,et al.  Detecting the overlapping and hierarchical community structure in complex networks , 2008, 0802.1218.

[21]  Artem Babenko,et al.  Big GANs Are Watching You: Towards Unsupervised Object Segmentation with Off-the-Shelf Generative Models , 2020, ArXiv.

[22]  Andrew Zisserman,et al.  Automated Flower Classification over a Large Number of Classes , 2008, 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing.

[23]  C. H. Coombs,et al.  Mathematical psychology : an elementary introduction , 1970 .

[25]  Jingdong Wang,et al.  Salient Object Detection: A Discriminative Regional Feature Integration Approach , 2013, International Journal of Computer Vision.

[26]  George Papandreou,et al.  Weakly-and Semi-Supervised Learning of a Deep Convolutional Network for Semantic Image Segmentation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[27]  Andrea Vedaldi,et al.  Instance Normalization: The Missing Ingredient for Fast Stylization , 2016, ArXiv.

[28]  Thomas S. Huang,et al.  Free-Form Image Inpainting With Gated Convolution , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[29]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[31]  Jian Sun,et al.  BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[32]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[33]  Jian Sun,et al.  Saliency Optimization from Robust Background Detection , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[34]  Pietro Perona,et al.  The Caltech-UCSD Birds-200-2011 Dataset , 2011 .

[35]  Stefano Soatto,et al.  Unsupervised Moving Object Detection via Contextual Information Separation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Charless C. Fowlkes,et al.  Contour Detection and Hierarchical Image Segmentation , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37]  Geoffrey E. Hinton,et al.  A Simple Framework for Contrastive Learning of Visual Representations , 2020, ICML.

[38]  Shi-Min Hu,et al.  Global contrast based salient region detection , 2011, CVPR 2011.

[39]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Jianbo Shi,et al.  DeepEdge: A multi-scale bifurcated deep network for top-down contour detection , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[42]  Xu Ji,et al.  Invariant Information Clustering for Unsupervised Image Classification and Segmentation , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[43]  Yan Wang,et al.  DeepContour: A deep convolutional feature learned by positive-sharing loss for contour detection , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[44]  Armand Joulin,et al.  Unsupervised Learning by Predicting Noise , 2017, ICML.

[45]  Asako Kanezaki,et al.  Unsupervised Image Segmentation by Backpropagation , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[46]  Kaiming He,et al.  Mask R-CNN , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[47]  Paolo Favaro,et al.  Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles , 2016, ECCV.

[48]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[49]  George Papandreou,et al.  Rethinking Atrous Convolution for Semantic Image Segmentation , 2017, ArXiv.

[50]  Trevor Darrell,et al.  Constrained Convolutional Neural Networks for Weakly Supervised Segmentation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[51]  Michal Valko,et al.  Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning , 2020, NeurIPS.

[52]  Huimin Ma,et al.  Weakly-Supervised Semantic Segmentation by Iteratively Mining Common Object Features , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[53]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[54]  Phillip Isola,et al.  Contrastive Multiview Coding , 2019, ECCV.

[55]  Huchuan Lu,et al.  Joint Learning of Saliency Detection and Weakly Supervised Semantic Segmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[56]  Pieter Abbeel,et al.  InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets , 2016, NIPS.

[57]  Andrew Blake,et al.  "GrabCut" , 2004, ACM Trans. Graph..

[58]  Jian Dong,et al.  Semantic Segmentation without Annotating Segments , 2013, 2013 IEEE International Conference on Computer Vision.

[59]  Fred A. Hamprecht,et al.  Inpainting Networks Learn to Separate Cells in Microscopy Images , 2020, BMVC.

[60]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[61]  Adam Bielski,et al.  Emergence of Object Segmentation in Perturbed Generative Models , 2019, NeurIPS.

[62]  Xiaogang Wang,et al.  Pyramid Scene Parsing Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[63]  Bernt Schiele,et al.  Simple Does It: Weakly Supervised Instance and Semantic Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[64]  Iasonas Kokkinos,et al.  Pushing the Boundaries of Boundary Detection using Deep Learning , 2015, ICLR 2016.

[65]  Wenyu Liu,et al.  Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[66]  Thomas S. Huang,et al.  Generative Image Inpainting with Contextual Attention , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[67]  Stefano Soatto,et al.  Time-Supervised Primary Object Segmentation , 2020, ArXiv.

[68]  Yoshua Bengio,et al.  Learning deep representations by mutual information estimation and maximization , 2018, ICLR.

[69]  Yinda Zhang,et al.  LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop , 2015, ArXiv.

[70]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[71]  Alexei A. Efros,et al.  Context Encoders: Feature Learning by Inpainting , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[72]  L. R. Dice Measures of the Amount of Ecologic Association Between Species , 1945 .

[73]  J. E. Glynn,et al.  Numerical Recipes: The Art of Scientific Computing , 1989 .

[74]  Kaiming He,et al.  Focal Loss for Dense Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[75]  Ludovic Denoyer,et al.  Unsupervised Object Segmentation by Redrawing , 2019, NeurIPS.

[76]  Andrew Zisserman,et al.  Delving into the Whorl of Flower Segmentation , 2007, BMVC.

[77]  Seong Joon Oh,et al.  Exploiting Saliency for Object Segmentation from Image Level Labels , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).