Paper Information - DeLay: Robust Spatial Layout Estimation for Cluttered Indoor Scenes

We consider the problem of estimating the spatial layout of an indoor scene from a monocular RGB image, modeled as the projection of a 3D cuboid. Existing solutions to this problem often rely strongly on hand-engineered features and vanishing point detection, which are prone to failure in the presence of clutter. In this paper, we present a method that uses a fully convolutional neural network (FCNN) in conjunction with a novel optimization framework for generating layout estimates. We demonstrate that our method is robust in the presence of clutter and handles a wide range of highly challenging scenes. We evaluate our method on two standard benchmarks and show that it achieves state-of-the-art results, outperforming previous methods by a wide margin.