SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes

Neural implicit surface representations have emerged as a promising paradigm to capture 3D shapes in a continuous and resolution-independent manner. However, adapting them to articulated shapes is non-trivial. Existing approaches learn a backward warp field that maps deformed points to canonical points. This is problematic because the backward warp field is pose-dependent and thus requires large amounts of data to learn. To address this, we introduce SNARF, which combines the advantages of linear blend skinning (LBS) for polygonal meshes with those of neural implicit surfaces by learning a forward deformation field without direct supervision. This deformation field is defined in canonical, pose-independent space, enabling generalization to unseen poses. Learning the deformation field from posed meshes alone is challenging because the canonical correspondences of deformed points are defined only implicitly and may not be unique under changes of topology. We propose a forward skinning model that finds all canonical correspondences of any deformed point using iterative root finding. We derive analytical gradients via implicit differentiation, enabling end-to-end training from 3D meshes with bone transformations. Compared to state-of-the-art neural implicit representations, our approach generalizes better to unseen poses while preserving accuracy. We demonstrate our method in challenging scenarios on (clothed) 3D humans in diverse and unseen poses.
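
The root-finding step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the learned skinning-weight network is replaced by a hand-written sigmoid field, there are only two bones, and all function names (rigid_z, skinning_weights, forward_skinning, find_canonical) are hypothetical. Given a deformed point, the sketch searches for a canonical point whose forward LBS deformation matches it, using Broyden's method. As a further assumption of this sketch, one search is started per bone by undoing that bone's rigid transformation.

```python
import numpy as np

def rigid_z(angle_deg, t=(0.0, 0.0, 0.0)):
    """4x4 rigid transform: rotation about the z axis plus translation t."""
    a = np.deg2rad(angle_deg)
    B = np.eye(4)
    B[:2, :2] = [[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]]
    B[:3, 3] = t
    return B

def skinning_weights(xc):
    """Toy canonical weight field (assumption): two bones blended along x."""
    w1 = 1.0 / (1.0 + np.exp(-4.0 * xc[0]))
    return np.array([1.0 - w1, w1])

def forward_skinning(xc, bones):
    """LBS: x_d = sum_i w_i(x_c) * (B_i x_c), weights queried in canonical space."""
    w = skinning_weights(xc)
    xh = np.append(xc, 1.0)  # homogeneous coordinates
    xd = np.zeros(3)
    for w_i, B in zip(w, bones):
        xd += w_i * (B @ xh)[:3]
    return xd

def numerical_jacobian(fn, x, eps=1e-6):
    """Central-difference Jacobian, used only to initialise Broyden's method."""
    J = np.zeros((3, 3))
    for k in range(3):
        d = np.zeros(3)
        d[k] = eps
        J[:, k] = (fn(x + d) - fn(x - d)) / (2.0 * eps)
    return J

def find_canonical(xd, bones, x_init, iters=50, tol=1e-9):
    """Solve forward_skinning(xc) - xd = 0 for xc with Broyden's method."""
    residual = lambda x: forward_skinning(x, bones) - xd
    x = np.asarray(x_init, dtype=float).copy()
    f = residual(x)
    J_inv = np.linalg.inv(numerical_jacobian(residual, x))
    for _ in range(iters):
        if np.linalg.norm(f) < tol:
            break
        dx = -J_inv @ f
        x_new = x + dx
        f_new = residual(x_new)
        df = f_new - f
        denom = dx @ J_inv @ df
        if abs(denom) > 1e-14:
            # Rank-one update of the inverse Jacobian (Sherman-Morrison form).
            J_inv += np.outer(dx - J_inv @ df, dx @ J_inv) / denom
        x, f = x_new, f_new
    return x

# Usage: deform a known canonical point, then recover correspondences.
bones = [rigid_z(0.0), rigid_z(30.0)]
xc_true = np.array([0.2, 0.1, -0.3])
xd = forward_skinning(xc_true, bones)

# One initialisation per bone: undo that bone's rigid transform.
inits = [(np.linalg.inv(B) @ np.append(xd, 1.0))[:3] for B in bones]
roots = [find_canonical(xd, bones, x0) for x0 in inits]
residuals = [np.linalg.norm(forward_skinning(r, bones) - xd) for r in roots]
```

In this toy setting both initialisations are expected to land on the same canonical point. With larger deformations or self-contact, different initialisations may converge to different roots, which is the non-uniqueness the abstract refers to.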
