Deep view synthesis from sparse photometric images

The goal of light transport acquisition is to take images from a sparse set of lighting and viewing directions, and combine them to enable arbitrary relighting with changing view. While relighting from sparse images has received significant attention, there has been comparatively little progress on view synthesis from a sparse set of "photometric" images, that is, images captured under controlled conditions and lit by a single directional source; we use a spherical gantry to position the camera on a sphere surrounding the object. In this paper, we synthesize novel viewpoints across a wide range of viewing directions (covering a 60° cone) from a sparse set of just six viewing directions. While our approach relates to previous view synthesis and image-based rendering techniques, those methods are usually restricted to much smaller baselines and operate on images captured under environment illumination. At our baselines, input images have few correspondences and large occlusions; however, we benefit from structured photometric images. Our method is based on a deep convolutional network trained to directly synthesize new views from the six input views. This network combines 3D convolutions on a plane sweep volume with a novel per-view, per-depth-plane attention map prediction network to effectively aggregate multi-view appearance. We train our network on a large-scale synthetic dataset of 1000 scenes with complex geometry and material properties. In practice, it synthesizes novel viewpoints for captured real data and reproduces complex appearance effects such as occlusions, view-dependent specularities, and hard shadows. Moreover, the method can be combined with previous relighting techniques to enable changing both lighting and view, and can be applied to computer vision problems such as multiview stereo from sparse image sets.
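To make the aggregation step more concrete, the following is a minimal NumPy sketch of how per-view, per-depth-plane attention maps might be used to blend a plane sweep volume. It is an illustration under stated assumptions, not the network described above: the function name `aggregate_views`, the tensor layout, and the choice of a softmax across views are all hypothetical, and the attention scores here are random placeholders rather than the output of a trained prediction network.

```python
import numpy as np

def aggregate_views(psv, logits):
    """Blend per-view plane sweep volumes using per-view, per-depth attention.

    psv:    array of shape (V, D, H, W, C), the appearance of each of V input
            views warped onto D depth planes of the target camera.
    logits: array of shape (V, D, H, W), unnormalized attention scores
            (assumed to come from some attention prediction network).
    Returns the fused volume of shape (D, H, W, C) and the normalized weights.
    """
    # Assumption: weights are normalized with a softmax over the view axis.
    shifted = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)
    # Weighted sum over views, broadcasting the weights across channels.
    fused = (weights[..., None] * psv).sum(axis=0)
    return fused, weights

# Six input views, eight depth planes, a 4x4 image, RGB channels.
rng = np.random.default_rng(0)
psv = rng.random((6, 8, 4, 4, 3))
logits = rng.standard_normal((6, 8, 4, 4))  # placeholder scores
fused, weights = aggregate_views(psv, logits)
print(fused.shape)  # (8, 4, 4, 3)
```

With this normalization, each pixel on each depth plane draws its appearance from a convex combination of the six views, so a view that is occluded at a given depth can in principle be down-weighted there. The fused volume would then be passed to further processing, such as the 3D convolutions mentioned above.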
