Paper Information - Convolutional Pseudo-Prior for Structured Labeling

Convolutional Pseudo-Prior for Structured Labeling

Current practice in convolutional neural networks (CNNs) remains largely bottom-up, and the role of top-down processes in CNNs for pattern analysis and visual inference is not very clear. In this paper, we propose a new method for structured labeling by developing a convolutional pseudo-prior (ConvPP) on the ground-truth labels. Our method has several interesting properties: (1) compared with classical machine learning algorithms like CRFs and Structural SVM, ConvPP automatically learns rich convolutional kernels to capture both short- and long-range contexts; (2) compared with cascade classifiers like Auto-Context, ConvPP avoids the iterative steps of learning a series of discriminative classifiers and automatically learns contextual configurations; (3) compared with recent efforts combining CNN models with CRFs and RNNs, ConvPP learns convolution in the labeling space with much improved modeling capability and less manual specification; (4) compared with Bayesian models like MRFs, ConvPP capitalizes on the rich representation power of convolution by automatically learning priors built on convolutional filters. We accomplish our task using a pseudo-likelihood approximation to the prior under a novel fixed-point network structure that facilitates an end-to-end learning process. We show state-of-the-art results on sequential labeling and image labeling benchmarks.
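
To make the pseudo-likelihood idea in the abstract concrete, the following is a minimal sketch, not the authors' network. It assumes a 1-D label sequence and a purely linear "convolution" over neighbouring labels, and it scores each label given its neighbours while never feeding the centre label to itself. The function name, the `weights` layout, and the choice to ignore out-of-range neighbours are all hypothetical choices made for illustration.

```python
import math


def log_pseudo_likelihood(labels, weights, bias, num_classes):
    """Return sum_i log P(y_i | neighbouring labels) for a 1-D label sequence.

    labels:  list of int class ids
    weights: dict mapping a non-zero offset d to a K x K table, where
             weights[d][c][k] scores class c at position i given class k
             at position i + d (hypothetical parameterisation)
    bias:    list of K per-class biases
    """
    n = len(labels)
    total = 0.0
    for i in range(n):
        scores = list(bias)
        for d, table in weights.items():
            j = i + d
            # The centre label is never an input to its own prediction,
            # and neighbours outside the sequence contribute nothing.
            if d == 0 or not (0 <= j < n):
                continue
            for c in range(num_classes):
                scores[c] += table[c][labels[j]]
        # Numerically stable log-softmax evaluated at the observed label.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += scores[labels[i]] - log_z
    return total


# Illustrative kernel (an assumption): neighbours vote for their own class.
K = 3
agree = [[1.0 if c == k else 0.0 for k in range(K)] for c in range(K)]
kernel = {-1: agree, 1: agree}

smooth = log_pseudo_likelihood([0, 0, 0, 0], kernel, [0.0] * K, K)
alternating = log_pseudo_likelihood([0, 1, 0, 1], kernel, [0.0] * K, K)
```

Under this toy kernel the smooth sequence receives a higher log pseudo-likelihood than the alternating one, which is the sense in which such a term acts as a prior on label configurations. In the paper's setting the kernels are learned rather than fixed by hand.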
