Invertible Neural Skinning

Building animatable and editable models of clothed humans from raw 3D scans and poses is a challenging problem. Existing reposing methods suffer from the limited expressiveness of Linear Blend Skinning (LBS), require costly mesh extraction to generate each new pose, and typically do not preserve surface correspondences across different poses. In this work, we introduce Invertible Neural Skinning (INS) to address these shortcomings. To maintain correspondences, we propose a Pose-conditioned Invertible Network (PIN) architecture, which extends the LBS process by learning additional pose-varying deformations. Next, we combine PIN with a differentiable LBS module to build an expressive and end-to-end Invertible Neural Skinning (INS) pipeline. We demonstrate the strong performance of our method by outperforming the state-of-the-art reposing techniques on clothed humans and preserving surface correspondences, while being an order of magnitude faster. We also perform an ablation study, which shows the usefulness of our pose-conditioning formulation, and our qualitative results display that INS can rectify artefacts introduced by LBS well. See our webpage for more details: https://yashkant.github.io/invertible-neural-skinning/

[1]  Andreas Geiger,et al.  ARAH: Animatable Volume Rendering of Articulated Human SDFs , 2022, ECCV.

[2]  Michael J. Black,et al.  Neural Point-based Shape Modeling of Humans in Challenging Clothing , 2022, 2022 International Conference on 3D Vision (3DV).

[3]  V. Lempitsky,et al.  MegaPortraits: One-shot Megapixel Neural Head Avatars , 2022, ACM Multimedia.

[4]  Yan Wang,et al.  Neural Surface Reconstruction of Dynamic Scenes with Monocular RGB-D Camera , 2022, NeurIPS.

[5]  Angjoo Kanazawa,et al.  TAVA: Template-free Animatable Volumetric Actors , 2022, ECCV.

[6]  Yu-Kun Lai,et al.  High-Fidelity Human Avatars from a Single RGB Camera , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Timur M. Bagautdinov,et al.  DANBO: Disentangled Articulated Neural Body Representations via Graph Neural Networks , 2022, ECCV.

[8]  Kostas Daniilidis,et al.  CaDeX: Learning Canonical Deformation Coordinate Space for Dynamic Surface Representation via Neural Homeomorphism , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Tao Yu,et al.  Structured Local Radiance Fields for Human Avatar Modeling , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Pratul P. Srinivasan,et al.  HumanNeRF: Free-viewpoint Rendering of Moving People from Monocular Video , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Michael J. Black,et al.  gDNA: Towards Generative Detailed Neural Avatars , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  A. Vedaldi,et al.  BANMo: Building Animatable 3D Neural Models from Many Casual Videos , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Michael J. Black,et al.  ICON: Implicit Clothed humans Obtained from Normals , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Michael J. Black,et al.  I M Avatar: Implicit Morphable Head Avatars from Videos , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Tao Yu,et al.  DoubleField: Bridging the Neural Surface and Radiance Fields for High-fidelity Human Reconstruction and Rendering , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Dimitrios Tzionas,et al.  Embodied Hands: Modeling and Capturing Hands and Bodies Together , 2022, ArXiv.

[17]  Hongyi Xu,et al.  H-NeRF: Neural Radiance Fields for Rendering and Temporal Reconstruction of Humans in Motion , 2021, NeurIPS.

[18]  Hongyi Xu,et al.  imGHUM: Implicit Generative Models of 3D Human Shape and Articulated Pose , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

[19]  Andreas Geiger,et al.  MetaAvatar: Learning Animatable Clothed Human Models from Few Depth Images , 2021, NeurIPS.

[20]  Michael J. Black,et al.  LEAP: Learning Articulated Occupancy of People , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Michael J. Black,et al.  SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

[22]  Matthias Niessner,et al.  Dynamic Surface Function Networks for Clothed Human Bodies , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

[23]  Michael J. Black,et al.  SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  S. Fidler,et al.  Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Francesc Moreno-Noguer,et al.  SMPLicit: Topology-aware Generative Model for Clothed People , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Hujun Bao,et al.  Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Yang Liu,et al.  Physics informed deep learning for computational elastodynamics without labeled data , 2020, Journal of Engineering Mechanics.

[28]  Ivan Kobyzev,et al.  Normalizing Flows: An Introduction and Review of Current Methods , 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Hujun Bao,et al.  Animatable Neural Radiance Fields for Human Body Modeling , 2021, ArXiv.

[30]  Michael J. Black,et al.  STAR: Sparse Trained Articulated Human Body Regressor , 2020, ECCV.

[31]  Takaaki Shiratori,et al.  DeepHandMesh: A Weakly-supervised Deep Encoder-Decoder Framework for High-fidelity Hand Mesh Modeling , 2020, ECCV.

[32]  Roberto Cipolla,et al.  Who Left the Dogs Out? 3D Animal Reconstruction with Expectation Maximization in the Loop , 2020, ECCV.

[33]  Gordon Wetzstein,et al.  Implicit Neural Representations with Periodic Activation Functions , 2020, NeurIPS.

[34]  Cristian Sminchisescu,et al.  GHUM & GHUML: Generative 3D Human Shape and Articulated Pose Models , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Hao Li,et al.  ARCH: Animatable Reconstruction of Clothed Humans , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Hanbyul Joo,et al.  PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Pratul P. Srinivasan,et al.  NeRF , 2020, ECCV.

[38]  Geoffrey E. Hinton,et al.  NASA: Neural Articulated Shape Approximation , 2019, ECCV.

[39]  Michael J. Black,et al.  Learning to Dress 3D People in Generative Clothing , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Natalia Gimelshein,et al.  PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.

[41]  Michael J. Black,et al.  Three-D Safari: Learning to Estimate Zebra Pose, Shape, and Texture From Images “In the Wild” , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[42]  Iasonas Kokkinos,et al.  HoloPose: Holistic 3D Human Reconstruction In-The-Wild , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Hao Li,et al.  PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[44]  Dimitrios Tzionas,et al.  Expressive Body Capture: 3D Hands, Face, and Body From a Single Image , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[45]  Nikolaus F. Troje,et al.  AMASS: Archive of Motion Capture As Surface Shapes , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[46]  Sebastian Nowozin,et al.  Occupancy Networks: Learning 3D Reconstruction in Function Space , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[47]  David Duvenaud,et al.  Invertible Residual Networks , 2018, ICML.

[48]  Ullrich Köthe,et al.  Analyzing Inverse Problems with Invertible Neural Networks , 2018, ICLR.

[49]  Andrew W. Fitzgibbon,et al.  Creatures great and SMAL: Recovering the shape and motion of animals from video , 2018, ACCV.

[50]  Prafulla Dhariwal,et al.  Glow: Generative Flow with Invertible 1x1 Convolutions , 2018, NeurIPS.

[51]  Arnold W. M. Smeulders,et al.  i-RevNet: Deep Invertible Networks , 2018, ICLR.

[52]  Michael J. Black,et al.  Learning a model of facial shape and expression from 4D scans , 2017, ACM Trans. Graph..

[53]  Iain Murray,et al.  Masked Autoregressive Flow for Density Estimation , 2017, NIPS.

[54]  Jorge Nocedal,et al.  On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima , 2016, ICLR.

[55]  Samy Bengio,et al.  Density estimation using Real NVP , 2016, ICLR.

[56]  Peter V. Gehler,et al.  Keep It SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image , 2016, ECCV.

[57]  Michael J. Black,et al.  SMPL: A Skinned Multi-Person Linear Model , 2023 .

[58]  Hugo Larochelle,et al.  MADE: Masked Autoencoder for Distribution Estimation , 2015, ICML.

[59]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[60]  Yoshua Bengio,et al.  NICE: Non-linear Independent Components Estimation , 2014, ICLR.

[61]  Dragomir Anguelov,et al.  SCAPE: shape completion and animation of people , 2005, ACM Trans. Graph..

[62]  William E. Lorensen,et al.  Marching cubes: A high resolution 3D surface construction algorithm , 1987, SIGGRAPH.

[63]  C. G. Broyden A Class of Methods for Solving Nonlinear Simultaneous Equations , 1965 .