Bayesian geometric modeling of indoor scenes

We propose a method for understanding the 3D geometry of indoor environments (e.g. bedrooms, kitchens) while simultaneously identifying objects in the scene (e.g. beds, couches, doors). We focus on how modeling the geometry and location of specific objects is helpful for indoor scene understanding. For example, beds are shorter than they are wide, and are more likely to be in the center of the room than cabinets, which are tall and narrow. We use a generative statistical model that integrates a camera model, an enclosing room “box”, frames (windows, doors, pictures), and objects (beds, tables, couches, cabinets), each with their own prior on size, relative dimensions, and locations. We fit the parameters of this complex, multi-dimensional statistical model using an MCMC sampling approach that combines discrete changes (e.g, adding a bed), and continuous parameter changes (e.g., making the bed larger). We find that introducing object category leads to state-of-the-art performance on room layout estimation, while also enabling recognition based only on geometry.

[1]  Joseph Schlecht,et al.  Sampling bedrooms , 2011, CVPR 2011.

[2]  T. Kanade,et al.  Geometric reasoning for single image structure recovery , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Honglak Lee,et al.  A Dynamic Bayesian Network Model for Autonomous 3D Reconstruction from a Single Indoor Image , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[4]  David A. Forsyth,et al.  Thinking Inside the Box: Using Appearance Models and Context Based on Room Geometry , 2010, ECCV.

[5]  Alexei A. Efros,et al.  Geometric context from a single image , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[6]  Joseph Schlecht,et al.  Learning models of object structure , 2009, NIPS.

[7]  Harry Shum,et al.  Image segmentation by data driven Markov chain Monte Carlo , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[8]  Derek Hoiem,et al.  Recovering the spatial layout of cluttered rooms , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[9]  Alexei A. Efros,et al.  Closing the loop in scene interpretation , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Takeo Kanade,et al.  Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces , 2010, NIPS.

[11]  Radford M. Neal Probabilistic Inference Using Markov Chain Monte Carlo Methods , 2011 .

[12]  Honglak Lee,et al.  Automatic Single-Image 3d Reconstructions of Indoor Manhattan World Scenes , 2007, ISRR.

[13]  Stephen Gould,et al.  Discriminative learning with latent variables for cluttered indoor scene understanding , 2010, CACM.

[14]  Alexei A. Efros,et al.  From 3D scene geometry to human workspace , 2011, CVPR 2011.

[15]  Jitendra Malik,et al.  Inferring spatial layout from a single image via depth-ordered grouping , 2008, 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.