Paper Information - Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer

Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer

Many machine learning models operate on images, but ignore the fact that images are 2D projections formed by 3D geometry interacting with light, in a process called rendering. Enabling ML models to understand image formation might be key for generalization. However, due to an essential rasterization step involving discrete assignment operations, rendering pipelines are non-differentiable and thus largely inaccessible to gradient-based ML techniques. In this paper, we present DIB-R, a differentiable rendering framework which allows gradients to be analytically computed for all pixels in an image. Key to our approach is to view foreground rasterization as a weighted interpolation of local properties and background rasterization as a distance-based aggregation of global geometry. Our approach allows for accurate optimization over vertex positions, colors, normals, light directions and texture coordinates through a variety of lighting models. We showcase our approach in two ML applications: single-image 3D object prediction, and 3D textured object generation, both trained exclusively using 2D supervision. Our project website is: this https URL
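The two ideas named in the abstract can be illustrated with a small, non-authoritative sketch. It assumes 2D screen-space triangles, barycentric weights for the foreground interpolation, and an exponential distance falloff combined as `1 - prod(1 - a_j)` for the background aggregation. The abstract does not state these formulas, so the function names, the falloff form, and the `delta` parameter are all hypothetical choices made for illustration, not the paper's implementation.

```python
import math


def barycentric_weights(p, v0, v1, v2):
    """Barycentric weights of 2D point p with respect to triangle (v0, v1, v2)."""
    (px, py), (x0, y0), (x1, y1), (x2, y2) = p, v0, v1, v2
    # Twice the signed area of the triangle.
    area = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if area == 0:
        raise ValueError("degenerate triangle")
    w1 = ((px - x0) * (y2 - y0) - (x2 - x0) * (py - y0)) / area
    w2 = ((x1 - x0) * (py - y0) - (px - x0) * (y1 - y0)) / area
    return 1.0 - w1 - w2, w1, w2


def shade_foreground(p, tri, colors):
    """Foreground pixel: weighted interpolation of per-vertex colors.

    The weights are smooth functions of the vertex positions, so the
    pixel value is differentiable with respect to them.
    """
    w = barycentric_weights(p, *tri)
    return tuple(sum(wk * c[ch] for wk, c in zip(w, colors)) for ch in range(3))


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    denom = dx * dx + dy * dy
    t = 0.0 if denom == 0 else ((px - ax) * dx + (py - ay) * dy) / denom
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def soft_coverage(p, triangles, delta=0.05):
    """Background pixel: distance-based aggregation over all faces.

    Assumed form (hypothetical): a_j = exp(-d_j / delta) per face, combined
    as 1 - prod_j (1 - a_j). Intended only for pixels outside every face,
    where d_j is the distance to the nearest edge of face j.
    """
    miss = 1.0
    for v0, v1, v2 in triangles:
        d = min(
            point_segment_distance(p, v0, v1),
            point_segment_distance(p, v1, v2),
            point_segment_distance(p, v2, v0),
        )
        miss *= 1.0 - math.exp(-d / delta)
    return 1.0 - miss


if __name__ == "__main__":
    tri = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
    rgb = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    print(shade_foreground((0.25, 0.25), tri, rgb))  # pixel inside the face
    print(soft_coverage((1.01, 0.0), [tri]))         # pixel just outside
    print(soft_coverage((3.0, 3.0), [tri]))          # pixel far away
```

In this sketch, both quantities vary smoothly with the vertex positions, which is the property that lets a loss defined on pixels be backpropagated to geometry. A pixel near a face receives a coverage value close to 1, and the value decays toward 0 as the pixel moves away.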
