RenderNet: A deep convolutional network for differentiable rendering from 3D shapes

Traditional computer graphics rendering pipeline is designed for procedurally generating 2D quality images from 3D shapes with high performance. The non-differentiability due to discrete operations such as visibility computation makes it hard to explicitly correlate rendering parameters and the resulting image, posing a significant challenge for inverse rendering tasks. Recent work on differentiable rendering achieves differentiability either by designing surrogate gradients for non-differentiable operations or via an approximate but differentiable renderer. These methods, however, are still limited when it comes to handling occlusion, and restricted to particular rendering effects. We present RenderNet, a differentiable rendering convolutional network with a novel projection unit that can render 2D images from 3D shapes. Spatial occlusion and shading calculation are automatically encoded in the network. Our experiments show that RenderNet can successfully learn to implement different shaders, and can be used in inverse rendering tasks to estimate shape, pose, lighting and texture from a single image.

[1]  Bui Tuong Phong Illumination for computer generated pictures , 1975, Commun. ACM.

[2]  Gavin S. P. Miller,et al.  Efficient algorithms for local and global accessibility shading , 1994, SIGGRAPH.

[3]  Greg Turk,et al.  Simplification and Repair of Polygonal Models Using Volumetric Techniques , 2003, IEEE Trans. Vis. Comput. Graph..

[4]  Adam Finkelstein,et al.  Suggestive contours for conveying shape , 2003, ACM Trans. Graph..

[5]  K. Hohn,et al.  Determining Lightness from an Image , 2004 .

[6]  Holger Winnemöller,et al.  Real-time video abstraction , 2006, SIGGRAPH 2006.

[7]  Sami Romdhani,et al.  A 3D Face Model for Pose and Illumination Invariant Face Recognition , 2009, 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance.

[8]  Leonidas J. Guibas,et al.  3D-Assisted Image Feature Synthesis for Novel Views of an Object , 2014, ArXiv.

[9]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[10]  Qiang Chen,et al.  Network In Network , 2013, ICLR.

[11]  Alexei A. Efros,et al.  Seeing 3D Chairs: Exemplar Part-Based 2D-3D Alignment Using a Large Dataset of CAD Models , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  Michael J. Black,et al.  OpenDR: An Approximate Differentiable Renderer , 2014, ECCV.

[13]  Scott E. Reed,et al.  Weakly-supervised Disentangling with Recurrent Transformations for 3D View Synthesis , 2015, NIPS.

[14]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[15]  Joshua B. Tenenbaum,et al.  Deep Convolutional Inverse Graphics Network , 2015, NIPS.

[16]  Jitendra Malik,et al.  Shape, Illumination, and Reflectance from Shading , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  Leonidas J. Guibas,et al.  ShapeNet: An Information-Rich 3D Model Repository , 2015, ArXiv.

[18]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[19]  Jian Sun,et al.  Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[20]  Max Jaderberg,et al.  Unsupervised Learning of 3D Structure from Images , 2016, NIPS.

[21]  Abhinav Gupta,et al.  Learning a Predictable and Generative Vector Representation for Objects , 2016, ECCV.

[22]  Honglak Lee,et al.  Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision , 2016, NIPS.

[23]  Thomas Brox,et al.  Learning to Generate Chairs, Tables and Cars with Convolutional Networks , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Hans-Peter Seidel,et al.  Deep Shading: Convolutional Neural Networks for Screen Space Shading , 2016, Comput. Graph. Forum.

[25]  Vladlen Koltun,et al.  Photographic Image Synthesis with Cascaded Refinement Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[26]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Mario Fritz,et al.  Novel Views of Objects from a Single Image , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Matan Sela,et al.  Learning Detailed Face Reconstruction from a Single Image , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Yang Liu,et al.  O-CNN , 2017, ACM Trans. Graph..

[31]  Ersin Yumer,et al.  Transformation-Grounded Image Generation Network for Novel 3D View Synthesis , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Simon Lucey,et al.  Rethinking Reprojection: Closing the Loop for Pose-Aware Shape Reconstruction from a Single Image , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[33]  Jiajun Wu,et al.  MarrNet: 3D Shape Reconstruction via 2.5D Sketches , 2017, NIPS.

[34]  Thomas Brox,et al.  Octree Generating Networks: Efficient Convolutional Architectures for High-resolution 3D Outputs , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[35]  Silvio Savarese,et al.  Weakly Supervised 3D Reconstruction with Adversarial Constraint , 2017, 2017 International Conference on 3D Vision (3DV).

[36]  Takanori Maehara,et al.  Neural Inverse Rendering for General Reflectance Photometric Stereo , 2018, ICML.

[37]  James M. Rehg,et al.  3D-RCNN: Instance-Level 3D Object Reconstruction via Render-and-Compare , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[38]  William T. Freeman,et al.  Unsupervised Training for 3D Morphable Model Regression , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[39]  Tatsuya Harada,et al.  Neural 3D Mesh Renderer , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[40]  Vittorio Ferrari,et al.  Learning to Generate and Reconstruct 3D Meshes with only 2D Supervision , 2018, BMVC.

[41]  Jitendra Malik,et al.  Multi-view Consistency as Supervisory Signal for Learning Shape and Pose Prediction , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[42]  Jitendra Malik,et al.  Multi-view Supervision for Single-View Reconstruction via Differentiable Ray Consistency , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).