Paper information - Differentiable Rendering: A Survey

Differentiable Rendering: A Survey

Deep neural networks (DNNs) have shown remarkable performance improvements on vision-related tasks such as object detection and image segmentation. Despite their success, they generally lack an understanding of the 3D objects that form the image, because it is not always possible to collect 3D information about the scene or to annotate it easily. Differentiable rendering is a novel field that allows the gradients of 3D objects to be calculated and propagated through images. It also reduces the need for 3D data collection and annotation, while enabling a higher success rate in various applications. This paper reviews the existing literature and discusses the current state of differentiable rendering, its applications, and open research problems.
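To make the idea of propagating gradients through an image concrete, the following is a minimal toy sketch, not a method taken from the survey. It assumes a 1D "scene" consisting of a single segment whose center is the only unknown, rendered with a sigmoid-softened edge so that pixel intensities vary smoothly with the center. The names `render`, `loss_and_grad`, and `fit_center`, as well as the radius, softness, and learning-rate values, are hypothetical choices made for illustration.

```python
import math


def render(center, radius=0.2, softness=0.05, n_pixels=64):
    """Render a soft-edged 1D segment as pixel intensities in (0, 1)."""
    image = []
    for i in range(n_pixels):
        x = (i + 0.5) / n_pixels                      # pixel center in [0, 1]
        u = (radius - abs(x - center)) / softness     # soft inside/outside score
        image.append(1.0 / (1.0 + math.exp(-u)))      # sigmoid instead of a hard edge
    return image


def loss_and_grad(center, target, radius=0.2, softness=0.05):
    """Mean squared image error and its analytic derivative w.r.t. the center."""
    n = len(target)
    image = render(center, radius, softness, n)
    loss, grad = 0.0, 0.0
    for i, (p, t) in enumerate(zip(image, target)):
        x = (i + 0.5) / n
        sign = (x > center) - (x < center)            # derivative of -|x - c| w.r.t. c
        dp_dc = p * (1.0 - p) * sign / softness       # chain rule through the sigmoid
        loss += (p - t) ** 2 / n
        grad += 2.0 * (p - t) * dp_dc / n
    return loss, grad


def fit_center(target, init=0.3, lr=0.05, steps=300):
    """Recover the segment center from an image by plain gradient descent."""
    c = init
    for _ in range(steps):
        _, g = loss_and_grad(c, target)
        c -= lr * g
    return c


target = render(0.6)            # "observed" image with an unknown center
estimate = fit_center(target)   # starts at 0.3 and moves toward 0.6
print(estimate)
```

The soft edge is the key design choice in this sketch: with a hard edge, pixel values would be piecewise constant in the center and the gradient would be zero almost everywhere, so the image error could not guide the scene parameter.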
