Inferring spatial layout from a single image via depth-ordered grouping

Inferring 3D spatial layout from a single 2D image is a fundamental visual task. We formulate it as a grouping problem in which edges are grouped into lines, then quadrilaterals, and finally depth-ordered planes. We demonstrate that the 3D structure of planar objects in indoor scenes can be inferred quickly and accurately without any learning or indexing.
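As an illustration of the first grouping stage only (edge segments into lines), the following is a minimal sketch and not the authors' method: the abstract does not state the grouping criteria, so the collinearity test, the function names (`line_params`, `collinear`, `group_segments`), and the thresholds `max_angle` and `max_offset` are all assumptions made for this example.

```python
import math

def line_params(seg):
    """Return (theta, rho) for the infinite line through a segment.

    seg is ((x1, y1), (x2, y2)); theta is the direction in [0, pi),
    rho is the signed distance of the line from the origin.
    """
    (x1, y1), (x2, y2) = seg
    theta = math.atan2(y2 - y1, x2 - x1) % math.pi
    nx, ny = -math.sin(theta), math.cos(theta)  # unit normal to the line
    return theta, nx * x1 + ny * y1

def collinear(a, b, max_angle=math.radians(3), max_offset=2.0):
    """Assumed test: b is collinear with a if their directions agree and
    both endpoints of b lie close to the line through a."""
    theta_a, rho_a = line_params(a)
    theta_b, _ = line_params(b)
    diff = abs(theta_a - theta_b)
    diff = min(diff, math.pi - diff)  # directions wrap around at pi
    if diff > max_angle:
        return False
    nx, ny = -math.sin(theta_a), math.cos(theta_a)
    return all(abs(nx * x + ny * y - rho_a) <= max_offset for x, y in b)

def group_segments(segments):
    """Greedily assign each segment to the first group whose seed
    segment it is collinear with; otherwise start a new group."""
    groups = []
    for seg in segments:
        for group in groups:
            if collinear(group[0], seg):
                group.append(seg)
                break
        else:
            groups.append([seg])
    return groups

# Hypothetical input: two broken horizontal pieces and two vertical pieces.
segments = [
    ((0, 0), (10, 0)),
    ((12, 0.5), (20, 0.5)),
    ((0, 0), (0, 10)),
    ((0, 12), (0, 20)),
]
groups = group_segments(segments)
```

The greedy pass compares each segment only against the seed of each group, which keeps the sketch short but makes the result depend on input order; the later stages named in the abstract (quadrilaterals and depth-ordered planes) are not covered here.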
