Fully Connected Deep Structured Networks

Convolutional neural networks with many layers have recently been shown to achieve excellent results on high-level tasks such as image classification, object detection and, more recently, semantic segmentation. For semantic segmentation in particular, a two-stage procedure is often employed: convolutional networks are first trained to provide good local, pixel-wise features, which are then fed into a second, more global stage, traditionally a graphical model. In this work we unify this two-stage process into a single joint training algorithm. We demonstrate our method on the semantic image segmentation task and show encouraging results on the challenging PASCAL VOC 2012 dataset.
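
To make the joint-training idea concrete, below is a minimal sketch in JAX, not the authors' implementation: a tiny convolutional "unary" network feeds a few differentiable mean-field-style refinement iterations, and a single jax.grad call backpropagates through both stages at once. Everything specific here is an assumption made for illustration; the 4-neighbour smoothing term merely stands in for the fully connected Gaussian pairwise potentials of a dense CRF, and the function names (unaries, mean_field), sizes, and hyperparameters are invented rather than taken from the paper.

```python
# Minimal sketch of joint CNN + CRF-style training (illustrative only).
import jax
import jax.numpy as jnp

H, W, C, K = 8, 8, 3, 4          # spatial size (H, W), input channels, number of classes
k1, k2 = jax.random.split(jax.random.PRNGKey(0))

params = {
    "w": 0.1 * jax.random.normal(k1, (3, 3, C, K)),  # 3x3 conv filters -> per-pixel class scores
    "b": jnp.zeros(K),
    "pairwise": jnp.array(1.0),                      # strength of the smoothing term
}

def unaries(params, image):
    """Stage 1: a tiny convolutional network producing per-pixel class scores."""
    x = image[None]                                  # add batch dim: (1, H, W, C)
    scores = jax.lax.conv_general_dilated(
        x, params["w"], window_strides=(1, 1), padding="SAME",
        dimension_numbers=("NHWC", "HWIO", "NHWC"))
    return scores[0] + params["b"]                   # (H, W, K)

def mean_field(params, unary, num_iters=5):
    """Stage 2: mean-field-style refinement; a 4-neighbour smoothing term
    stands in for the dense Gaussian pairwise potentials of a full CRF."""
    q = jax.nn.softmax(unary, axis=-1)
    for _ in range(num_iters):
        # message passing: average the neighbours' label distributions
        msg = (jnp.roll(q, 1, 0) + jnp.roll(q, -1, 0) +
               jnp.roll(q, 1, 1) + jnp.roll(q, -1, 1)) / 4.0
        q = jax.nn.softmax(unary + params["pairwise"] * msg, axis=-1)
    return q

def loss(params, image, labels):
    """Joint objective: cross-entropy on the refined marginals, so gradients
    flow through the unrolled inference back into the convolutional features."""
    q = mean_field(params, unaries(params, image))
    logq = jnp.log(q + 1e-8)
    return -jnp.mean(jnp.take_along_axis(logq, labels[..., None], axis=-1))

image = jax.random.normal(k2, (H, W, C))
labels = jnp.zeros((H, W), dtype=jnp.int32)
grads = jax.grad(loss)(params, image, labels)        # one backward pass updates both stages
```

In a genuine two-stage pipeline, the convolutional weights would be trained first and the refinement stage tuned separately; differentiating through the unrolled inference steps, as in the last line, is what makes the training joint.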
