Geometric reasoning for single image structure recovery

We study the problem of generating plausible interpretations of a scene from a collection of line segments automatically extracted from a single indoor image. We show that the three-dimensional structure of a building interior can be recognized even in the presence of occluding objects. Geometric reasoning proposes several physically valid structure hypotheses, which are then verified against the line segments to find the best-fitting model; that model is converted to a full 3D model. Our experiments demonstrate that structure recovery from line segments is comparable to methods that use full image appearance. Our approach shows how a set of rules describing geometric constraints between groups of segments can be used to prune scene interpretation hypotheses and to generate the most plausible interpretation.
