Paper information - Localizing 3D cuboids in single-view images

Localizing 3D cuboids in single-view images

In this paper, we seek to detect rectangular cuboids and localize their corners in uncalibrated single-view images depicting everyday scenes. In contrast to recent approaches that rely on detecting vanishing points of the scene and grouping line segments to form cuboids, we build a discriminative parts-based detector that models the appearance of the cuboid corners and internal edges while enforcing consistency with a 3D cuboid model. Our model copes with different 3D viewpoints and aspect ratios and can detect cuboids across many different object categories. We introduce a database of images with cuboid annotations that spans a variety of indoor and outdoor scenes, and we show qualitative and quantitative results on this database. Our model outperforms baseline detectors that use 2D constraints alone on the task of localizing cuboid corners.

Jianxiong Xiao | Antonio Torralba | Bryan C. Russell
