Image-Based Synthesis and Re-synthesis of Viewpoints Guided by 3D Models

We propose a technique that uses structural information extracted from a set of 3D models of an object class to improve novel-view synthesis for images showing unknown instances of this class. These novel views can be used to "amplify" training image collections that typically contain only a small number of views or lack certain classes of views entirely (e.g., top views). We extract the correlation of position, normal, reflectance and appearance from computer-generated images of a few exemplars and use this information to infer the appearance of new instances. We show that our approach can improve the performance of state-of-the-art detectors trained on real-world data. Additional applications include guided versions of inpainting, 2D-to-3D conversion, super-resolution and non-local smoothing.
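
The idea of inferring appearance from correlated guide channels can be illustrated with a minimal sketch. It is not the method of this paper: it assumes, purely for illustration, that every pixel carries a guide vector (for example 3D position, normal and reflectance rendered from a 3D model) and that the appearance of a target pixel is copied from the source pixel with the closest guide vector. The function name `transfer_appearance`, the array layout and the brute-force nearest-neighbour search are all hypothetical choices.

```python
import numpy as np


def transfer_appearance(src_guide, src_appearance, dst_guide):
    """Copy appearance from source to target pixels by guide similarity.

    src_guide:      (H, W, G) guide channels of the source view (assumed layout)
    src_appearance: (H, W, C) known appearance (e.g. RGB) of the source view
    dst_guide:      (H2, W2, G) guide channels of the view to synthesize
    Returns an (H2, W2, C) array with the inferred appearance.
    """
    src_g = src_guide.reshape(-1, src_guide.shape[-1])
    src_a = src_appearance.reshape(-1, src_appearance.shape[-1])
    dst_g = dst_guide.reshape(-1, dst_guide.shape[-1])

    # Squared Euclidean distance between every target and source guide
    # vector; the result has shape (n_target_pixels, n_source_pixels).
    d2 = ((dst_g[:, None, :] - src_g[None, :, :]) ** 2).sum(axis=-1)

    # Index of the closest source pixel for each target pixel.
    nearest = d2.argmin(axis=1)

    out = src_a[nearest]
    return out.reshape(dst_guide.shape[:-1] + (src_a.shape[-1],))
```

The brute-force distance matrix grows with the product of the two pixel counts, so this sketch only suits small images; a practical system would need an approximate search and some form of spatial regularization.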
