Specular-to-Diffuse Translation for Multi-View Reconstruction

Most multi-view 3D reconstruction algorithms, especially those that use shape-from-shading cues, assume that object appearance is predominantly diffuse. To alleviate this restriction, we introduce S2Dnet, a generative adversarial network that translates multiple views of objects with specular reflection into diffuse ones, so that multi-view reconstruction methods can be applied more effectively. Our network extends unsupervised image-to-image translation to multi-view “specular to diffuse” translation. To preserve object appearance across multiple views, we introduce a Multi-View Coherence (MVC) loss that evaluates the similarity and faithfulness of local patches after the view transformation. In addition, we carefully design and generate a large synthetic training dataset using physically-based rendering. During testing, our network takes only the raw glossy images as input, without extra information such as segmentation masks or lighting estimation. Results demonstrate that multi-view reconstruction can be significantly improved by using the images filtered by our network.
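To make the idea of a patch-based coherence term concrete, the following is a minimal sketch and not the MVC loss as defined in the paper, whose exact formulation is not given in this abstract. It assumes that corresponding patch centres between two translated views are already known (the `matches` argument is hypothetical), that images are grayscale, and that a mean absolute difference is an acceptable stand-in for the similarity measure. All function names are illustrative.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]          # grayscale image as rows of floats
Point = Tuple[int, int]            # (row, column) of a patch centre


def extract_patch(img: Image, cy: int, cx: int, half: int) -> List[float]:
    """Return the flattened (2*half+1) x (2*half+1) patch centred at (cy, cx).

    Assumption: the centre lies at least `half` pixels away from every border.
    """
    return [img[y][x]
            for y in range(cy - half, cy + half + 1)
            for x in range(cx - half, cx + half + 1)]


def patch_coherence_loss(view_a: Image,
                         view_b: Image,
                         matches: Sequence[Tuple[Point, Point]],
                         half: int = 1) -> float:
    """Mean absolute difference between corresponding local patches.

    `matches` pairs a patch centre in view_a with its assumed counterpart
    in view_b. A value of 0 means the matched patches look identical.
    """
    if not matches:
        return 0.0
    total = 0.0
    for (ya, xa), (yb, xb) in matches:
        pa = extract_patch(view_a, ya, xa, half)
        pb = extract_patch(view_b, yb, xb, half)
        # Average per-pixel absolute difference for this patch pair.
        total += sum(abs(p - q) for p, q in zip(pa, pb)) / len(pa)
    return total / len(matches)


# Usage: two 5x7 views in which the same bright spot appears one column apart.
view_a = [[0.0] * 7 for _ in range(5)]
view_b = [[0.0] * 7 for _ in range(5)]
view_a[2][2] = 1.0
view_b[2][3] = 1.0
loss = patch_coherence_loss(view_a, view_b, [((2, 2), (2, 3))])
```

In a training setup, a term of this kind would presumably be weighted against the adversarial objective, penalising translated views whose matched patches disagree in appearance.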
