Toward Automatic 3D Generic Object Modeling from One Single Image

We present a novel method for the challenging problem of generating 3D models of generic object categories from a single uncalibrated image. Our method builds on the algorithm proposed in [1], which enables a partial reconstruction of the object from a single view. A full reconstruction is achieved in a subsequent object completion stage, where modified or state-of-the-art 3D shape and texture completion techniques recover the complete 3D model. We present results on a number of images containing objects from five generic categories (mice, staplers, mugs, cars, and bicycles). We demonstrate, both quantitatively and qualitatively, that our method produces convincing 3D models from a single image with minimal or no human intervention. Our technique targets applications in which users want to build virtual collections of 3D object models and share them in virtual environments such as Google 3D Warehouse or Second Life (secondlife.com).

[1] Roberto Cipolla, et al. Modelling and Interpretation of Architecture from Several Images, 2004, International Journal of Computer Vision.

[2] Marc Levoy, et al. The digital Michelangelo project: 3D scanning of large statues, 2000, SIGGRAPH.

[3] Richard Szeliski, et al. A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[4] Ariel Shamir, et al. Seam carving for media retargeting, 2009, CACM.

[5] Dana H. Ballard, et al. Generalizing the Hough transform to detect arbitrary shapes, 1981, Pattern Recognit.

[6] Stephen Gould, et al. Discriminative learning with latent variables for cluttered indoor scene understanding, 2010, CACM.

[7] Leonard McMillan, et al. Plenoptic modeling: an image-based rendering system, 1995, SIGGRAPH.

[8] Zhengyou Zhang, et al. Iterative point matching for registration of free-form curves and surfaces, 1994, International Journal of Computer Vision.

[9] Marc Alexa, et al. Context-based surface completion, 2004, ACM Trans. Graph.

[10] Leonidas J. Guibas, et al. Example-Based 3D Scan Completion, 2005.

[11] Sung Yong Shin, et al. On pixel-based texture synthesis by non-parametric sampling, 2006, Comput. Graph.

[12] Richard Szeliski, et al. Building Rome in a day, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[13] Daniel G. Aliaga, et al. Sea of images, 2002, IEEE Visualization, 2002. VIS 2002.

[14] Alexei A. Efros, et al. Automatic photo pop-up, 2005, SIGGRAPH 2005.

[15] Stephen Gould, et al. Discriminative Learning with Latent Variables for Cluttered Indoor Scene Understanding, 2010, ECCV.

[16] Pietro Perona, et al. 3D Reconstruction by Shadow Carving: Theory and Practical Evaluation, 2007, International Journal of Computer Vision.

[17] Marc Levoy, et al. Real-time 3D model acquisition, 2002, ACM Trans. Graph.

[18] Alberto Del Bimbo, et al. Metric 3D reconstruction and texture acquisition of surfaces of revolution from a single uncalibrated view, 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19] Alexei A. Efros, et al. Scene completion using millions of photographs, 2008, Commun. ACM.

[20] Paul Debevec, et al. Modeling and Rendering Architecture from Photographs, 1996, SIGGRAPH 1996.

[21] Pietro Perona, et al. Visual navigation using a single camera, 1995, Proceedings of IEEE International Conference on Computer Vision.

[22] Andrew W. Fitzgibbon, et al. Single View Reconstruction of Curved Surfaces, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[23] Silvio Savarese, et al. Depth-Encoded Hough Voting for Joint Object Detection and Shape Recovery, 2010, ECCV.

[24] Thomas A. Funkhouser, et al. The Princeton Shape Benchmark, 2004, Proceedings Shape Modeling Applications, 2004.

[25] Michael Bosse, et al. Calibrated, Registered Images of an Extended Urban Area, 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[26] J. Hughes, et al. SmoothSketch: 3D free-form shapes from complex sketches, 2006, ACM Trans. Graph.

[27] Kiriakos N. Kutulakos, et al. A Theory of Shape by Space Carving, 2000, International Journal of Computer Vision.

[28] Dani Lischinski, et al. Deep photo: model-based photograph enhancement and viewing, 2008, SIGGRAPH 2008.

[29] Steven M. Seitz, et al. View morphing, 1996, SIGGRAPH.

[30] Daniel Cohen-Or, et al. Surface reconstruction using local shape priors, 2007, Symposium on Geometry Processing.

[31] Frédéric Jurie, et al. Groups of Adjacent Contour Segments for Object Detection, 2008, IEEE Trans. Pattern Anal. Mach. Intell.

[32] Marc Levoy, et al. Light field rendering, 1996, SIGGRAPH.

[33] Ping Tan, et al. Symmetric architecture modeling with a single image, 2009, SIGGRAPH 2009.

[34] Andrew Blake, et al. "GrabCut": interactive foreground extraction using iterated graph cuts, 2004, ACM Trans. Graph.

[35] Reinhard Koch, et al. Visual Modeling with a Hand-Held Camera, 2004, International Journal of Computer Vision.

[36] Silvio Savarese, et al. 3D generic object categorization, localization and pose estimation, 2007, 2007 IEEE 11th International Conference on Computer Vision.

[37] Richard Szeliski, et al. High-quality video view interpolation using a layered representation, 2004, SIGGRAPH 2004.

[38] Paulo R. S. Mendonça, et al. Camera Pose Estimation and Reconstruction from Image Profiles under Circular Motion, 2000, ECCV.

[39] Sylvain Paris, et al. Error-Tolerant Image Compositing, 2010, ECCV.

[40] Bobby Bodenheimer, et al. Synthesis and evaluation of linear motion transitions, 2008, TOGS.

[41] Steven M. Seitz, et al. Photo tourism: exploring photo collections in 3D, 2006, ACM Trans. Graph.

[42] Luc Van Gool, et al. Depth-From-Recognition: Inferring Meta-data by Cognitive Feedback, 2007, 2007 IEEE 11th International Conference on Computer Vision.

[43] B. Schiele, et al. Combined Object Categorization and Segmentation With an Implicit Shape Model, 2004.

[44] Patrick Pérez, et al. Poisson image editing, 2003, ACM Trans. Graph.

[45] Antonio Criminisi, et al. Creating Architectural Models from Images, 1999, Comput. Graph. Forum.

[46] Harry Shum, et al. Sketching reality: Realistic interpretation of architectural designs, 2008, TOGS.

[47] Ken-ichi Anjyo, et al. Tour into the picture: using a spidery mesh interface to make animation from a single image, 1997, SIGGRAPH.

[48] Luc Van Gool, et al. The Pascal Visual Object Classes (VOC) Challenge, 2010, International Journal of Computer Vision.

[49] Alla Sheffer, et al. Template-based mesh completion, 2005, SGP '05.

[50] Patrick Pérez, et al. Object removal by exemplar-based inpainting, 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings.