Deep, Dense, and Low-Rank Gaussian Conditional Random Fields

In this work we introduce a fully-connected graph structure in the Deep Gaussian Conditional Random Field (G-CRF) model. For this we express the pairwise interactions between pixels as the inner-products of low-dimensional embeddings, delivered by a new subnetwork of a deep architecture. We efficiently minimize the resulting energy by solving the resulting low-rank linear system with conjugate gradients, and derive an analytic expression for the gradient of our embeddings which allows us to train them end-to-end with backpropagation. We demonstrate the merit of our approach by achieving state of the art results on three challenging Computer Vision benchmarks, namely semantic segmentation, human parts segmentation, and saliency estimation. Our implementation is fully GPU based, built on top of the Caffe library, and will be made publicly available.

[1]  Liang Lin,et al.  PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures With Edge-Preserving Coherence , 2015, IEEE Transactions on Image Processing.

[2]  Cristian Sminchisescu,et al.  Training Deep Networks with Structured Layers by Matrix Backpropagation , 2015, ArXiv.

[3]  George Papandreou,et al.  Weakly-and Semi-Supervised Learning of a Deep Convolutional Network for Semantic Image Segmentation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[4]  Vibhav Vineet,et al.  Improved Initialization and Gaussian Mixture Pairwise Terms for Dense Random Fields with Mean-field Inference , 2012, BMVC.

[5]  Vladlen Koltun,et al.  Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials , 2011, NIPS.

[6]  Xiaogang Wang,et al.  Saliency detection by multi-context deep learning , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  James M. Rehg,et al.  The Secrets of Salient Object Segmentation , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Edward H. Adelson,et al.  Learning Gaussian Conditional Random Fields for Low-Level Vision , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[12]  Yizhou Yu,et al.  Visual saliency based on multiscale deep features , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Huchuan Lu,et al.  Saliency detection via Cellular Automata , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Rama Chellappa,et al.  Gaussian Conditional Random Field Network for Semantic Segmentation , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Xiaoxiao Li,et al.  Semantic Image Segmentation via Deep Parsing Network , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[16]  Sanja Fidler,et al.  Detect What You Can: Detecting and Representing Objects Using Holistic Models and Body Parts , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Iasonas Kokkinos,et al.  Fast, Exact and Multi-scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs , 2016, ECCV.

[18]  Jitendra Malik,et al.  Hypercolumns for object segmentation and fine-grained localization , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  J. Shewchuk An Introduction to the Conjugate Gradient Method Without the Agonizing Pain , 1994 .

[20]  Huchuan Lu,et al.  Deep networks for saliency detection via local estimation and global search , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Alan L. Yuille,et al.  Learning Deep Structured Models , 2014, ICML.

[22]  Iasonas Kokkinos,et al.  ÜberNet: a ‘Universal’ CNN for the Joint Treatment of Low-, Mid-, and High-Level Vision Problems , 2016 .

[23]  Yi Yang,et al.  Attention to Scale: Scale-Aware Semantic Image Segmentation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Jan Kautz,et al.  Fully-Connected CRFs with Non-Parametric Pairwise Potential , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Vibhav Vineet,et al.  Conditional Random Fields as Recurrent Neural Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[26]  Iasonas Kokkinos,et al.  Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs , 2014, ICLR.

[27]  Shuicheng Yan,et al.  Semantic Object Parsing with Local-Global Long Short-Term Memory , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Ivan Laptev,et al.  Context-Aware CNNs for Person Head Detection , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[29]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[30]  Guosheng Lin,et al.  Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Iasonas Kokkinos,et al.  Pushing the Boundaries of Boundary Detection using Deep Learning , 2015, ICLR 2016.

[32]  Guosheng Lin,et al.  Deep convolutional neural fields for depth estimation from a single image , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Shuicheng Yan,et al.  Semantic Object Parsing with Graph LSTM , 2016, ECCV.

[34]  Sebastian Nowozin,et al.  Regression Tree Fields — An efficient, non-parametric approach to image labeling problems , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[35]  Alan L. Yuille,et al.  Zoom Better to See Clearer: Human Part Segmentation with Auto Zoom Net , 2015, ArXiv.

[36]  Yizhou Yu,et al.  Deep Contrast Learning for Salient Object Detection , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).