Paper Information - Gaussian Conditional Random Field Network for Semantic Segmentation


In contrast to existing approaches that use discrete Conditional Random Field (CRF) models, we propose to use a Gaussian CRF model for the task of semantic segmentation. We introduce a novel deep network, which we refer to as the Gaussian Mean Field (GMF) network, whose layers perform mean field inference over a Gaussian CRF. The GMF network has the desirable property that each of its layers produces an output that is closer to the maximum a posteriori (MAP) solution of the Gaussian CRF than its input. By combining the GMF network with deep Convolutional Neural Networks (CNNs), we obtain a new end-to-end trainable Gaussian conditional random field network. The Gaussian CRF network is composed of three sub-networks: (i) a CNN-based unary network for generating unary potentials, (ii) a CNN-based pairwise network for generating pairwise potentials, and (iii) a GMF network for performing Gaussian CRF inference. When trained end-to-end in a discriminative fashion and evaluated on the challenging PASCAL VOC 2012 segmentation dataset, the proposed Gaussian CRF network outperforms various recent semantic segmentation approaches that combine CNNs with discrete CRF models.
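To make the claim that each layer moves closer to the MAP solution more concrete, here is a minimal NumPy sketch. It is not the paper's GMF architecture. It assumes a generic Gaussian CRF energy E(y) = 0.5 * y^T A y - b^T y with a symmetric positive definite matrix A and one scalar output per node, and it applies the standard sequential mean field update for the Gaussian mean, one node at a time. The names `mean_field_sweep`, `energy`, `A`, and `b` are hypothetical and chosen only for illustration. Each sweep plays the role of one "layer", and the distance to the MAP solution is measured in the norm induced by A.

```python
import numpy as np


def mean_field_sweep(A, b, y):
    """One sequential sweep of Gaussian mean field updates (illustrative).

    For E(y) = 0.5 * y^T A y - b^T y, the mean field update of node i,
    with all other nodes held fixed, is y_i = (b_i - sum_{j != i} A_ij y_j) / A_ii.
    """
    y = y.copy()
    for i in range(len(b)):
        # Contribution of all neighbours j != i to node i.
        off_diag = A[i] @ y - A[i, i] * y[i]
        y[i] = (b[i] - off_diag) / A[i, i]
    return y


def energy(A, b, y):
    """Quadratic Gaussian CRF energy (assumed form, see lead-in)."""
    return 0.5 * y @ A @ y - b @ y


# Toy problem: a random symmetric positive definite A stands in for the
# pairwise terms and b for the unary terms (both are made-up data).
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = M @ M.T + 6.0 * np.eye(6)
b = rng.standard_normal(6)

# For a Gaussian CRF the MAP solution is the solution of A y = b.
y_map = np.linalg.solve(A, b)

y = np.zeros(6)
errors = []    # distance to the MAP solution in the A-norm
energies = []  # energy after each sweep
for _ in range(10):
    diff = y - y_map
    errors.append(float(np.sqrt(diff @ A @ diff)))
    energies.append(float(energy(A, b, y)))
    y = mean_field_sweep(A, b, y)

print("A-norm error per sweep:", np.round(errors, 6))
```

Because every single-node update minimizes the energy exactly along that coordinate, the energy never increases in this toy setting, so the A-norm distance to the MAP solution never increases from one sweep to the next. Updating all nodes in parallel instead would need an additional condition on A (for example, diagonal dominance) to keep this guarantee.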
