TightCap: 3D Human Shape Capture with Clothing Tightness Field

In this paper, we present TightCap, a data-driven scheme to capture both the human shape and dressed garments accurately with only a single 3D human scan, which enables numerous applications such as virtual try-on, biometrics and body evaluation. To break the severe variations of the human poses and garments, we propose to model the clothing tightness - the displacements from the garments to the human shape implicitly in the global UV texturing domain. To this end, we utilize an enhanced statistical human template and an effective multi-stage alignment scheme to map the 3D scan into a hybrid 2D geometry image. Based on this 2D representation, we propose a novel framework to predicted clothing tightness via a novel tightness formulation, as well as an effective optimization scheme to further reconstruct multi-layer human shape and garments under various clothing categories and human postures. We further propose a new clothing tightness dataset (CTD) of human scans with a large variety of clothing styles, poses and corresponding ground-truth human shapes to stimulate further research. Extensive experiments demonstrate the effectiveness of our TightCap to achieve high-quality human shape and dressed garments reconstruction, as well as the further applications for clothing segmentation, retargeting and animation.

[1]  Marc Alexa,et al.  As-rigid-as-possible surface modeling , 2007, Symposium on Geometry Processing.

[2]  Jitendra Malik,et al.  End-to-End Recovery of Human Shape and Pose , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[3]  Yaser Sheikh,et al.  OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Hao Li,et al.  ARCH: Animatable Reconstruction of Clothed Humans , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Yaser Sheikh,et al.  Monocular Total Capture: Posing Face, Body, and Hands in the Wild , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Michael J. Black,et al.  Detailed, Accurate, Human Shape Estimation from Clothed 3D Scan Sequences , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Hao Li,et al.  PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[8]  Yaser Sheikh,et al.  Hand Keypoint Detection in Single Images Using Multiview Bootstrapping , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Carlos Hernandez,et al.  Multi-View Stereo: A Tutorial , 2015, Found. Trends Comput. Graph. Vis..

[10]  Song-Chun Zhu,et al.  DenseRaC: Joint 3D Pose and Shape Estimation by Dense Render-and-Compare , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[11]  Jingyi Yu,et al.  SportsCap: Monocular 3D Human Motion Capture and Fine-Grained Understanding in Challenging Sports Videos , 2021, International Journal of Computer Vision.

[12]  Jingyi Yu,et al.  Few-shot Neural Human Performance Rendering from Sparse RGBD Videos , 2021, IJCAI.

[13]  Jia Deng,et al.  Stacked Hourglass Networks for Human Pose Estimation , 2016, ECCV.

[14]  Lu Fang,et al.  UnstructuredFusion: Realtime 4D Geometry and Texture Reconstruction Using Commercial RGBD Cameras , 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Jan-Michael Frahm,et al.  Pixelwise View Selection for Unstructured Multi-View Stereo , 2016, ECCV.

[16]  Ligang Liu,et al.  Scanning 3D Full Human Bodies Using Kinects , 2012, IEEE Transactions on Visualization and Computer Graphics.

[17]  Alvaro Collet,et al.  High-quality streamable free-viewpoint video , 2015, ACM Trans. Graph..

[18]  Paolo Cignoni,et al.  Metro: Measuring Error on Simplified Surfaces , 1998, Comput. Graph. Forum.

[19]  Takeo Kanade,et al.  Shape-From-Silhouette Across Time Part II: Applications to Human Modeling and Markerless Motion Tracking , 2005, International Journal of Computer Vision.

[20]  Dieter Fox,et al.  DynamicFusion: Reconstruction and tracking of non-rigid scenes in real-time , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Hans-Peter Seidel,et al.  Estimating body shape of dressed humans , 2009, Comput. Graph..

[22]  Yaser Sheikh,et al.  OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  Takeo Kanade,et al.  Shape-from-silhouette of articulated objects and its use for human body kinematics estimation and motion capture , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[24]  Alexander M. Bronstein,et al.  Deformable Shape Completion with Graph Convolutional Autoencoders , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[25]  Michael J. Black,et al.  Learning to Dress 3D People in Generative Clothing , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Tao Yu,et al.  DeepHuman: 3D Human Reconstruction From a Single Image , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[27]  M. Pauly,et al.  Embedded deformation for shape manipulation , 2007, SIGGRAPH 2007.

[28]  Marcus A. Magnor,et al.  Video Based Reconstruction of 3D People Models , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[29]  Bharat Lal Bhatnagar,et al.  Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction , 2020, ECCV.

[30]  Xin Chen,et al.  Pose2Body: Pose-Guided Human Parts Segmentation , 2019, 2019 IEEE International Conference on Multimedia and Expo (ICME).

[31]  Varun Ramakrishna,et al.  Convolutional Pose Machines , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Andrew J. Davison,et al.  DTAM: Dense tracking and mapping in real-time , 2011, 2011 International Conference on Computer Vision.

[33]  Yaser Sheikh,et al.  Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[34]  Hao Li,et al.  SiCloPe: Silhouette-Based Clothed People , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Dragomir Anguelov,et al.  SCAPE: shape completion and animation of people , 2005, ACM Trans. Graph..

[36]  Andrew W. Fitzgibbon,et al.  Bundle Adjustment - A Modern Synthesis , 1999, Workshop on Vision Algorithms.

[37]  Peter V. Gehler,et al.  Unite the People: Closing the Loop Between 3D and 2D Human Representations , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Andrew W. Fitzgibbon,et al.  KinectFusion: Real-time dense surface mapping and tracking , 2011, 2011 10th IEEE International Symposium on Mixed and Augmented Reality.

[39]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[40]  Alexei A. Efros,et al.  The Unreasonable Effectiveness of Deep Features as a Perceptual Metric , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[41]  Eero P. Simoncelli,et al.  Image quality assessment: from error visibility to structural similarity , 2004, IEEE Transactions on Image Processing.

[42]  Pascal Fua,et al.  On benchmarking camera calibration and multi-view stereo for high resolution imagery , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[43]  Iasonas Kokkinos,et al.  DensePose: Dense Human Pose Estimation in the Wild , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[44]  Yaser Sheikh,et al.  Fully Convolutional Mesh Autoencoder using Efficient Spatially Varying Kernels , 2020, NeurIPS.

[45]  C. Cobelli,et al.  A Markerless Motion Capture System to Study Musculoskeletal Biomechanics: Visual Hull and Simulated Annealing Approach , 2006, Annals of Biomedical Engineering.

[46]  Gerard Pons-Moll,et al.  360-Degree Textures of People in Clothing from a Single Image , 2019, 2019 International Conference on 3D Vision (3DV).

[47]  Sebastian Thrun,et al.  SCAPE: shape completion and animation of people , 2005, SIGGRAPH 2005.

[48]  Michael J. Black,et al.  SMPL: A Skinned Multi-Person Linear Model , 2023 .

[49]  Dima Damen,et al.  Recognizing linked events: Searching the space of feasible explanations , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[50]  Steven J. Gortler,et al.  Geometry images , 2002, SIGGRAPH.

[51]  Takeo Kanade,et al.  Visual hull alignment and refinement across time: a 3D reconstruction algorithm combining shape-from-silhouette with stereo , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[52]  Christian Theobalt,et al.  MonoPerfCap , 2017, ACM Trans. Graph..

[53]  Tony Tung,et al.  SIZER: A Dataset and Model for Parsing 3D Clothing and Learning Size Sensitive 3D Clothing , 2020, ECCV.

[54]  Qionghai Dai,et al.  SimulCap : Single-View Human Performance Capture With Cloth Simulation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[55]  Michael J. Black,et al.  Detailed Full-Body Reconstructions of Moving People from Monocular RGB-D Sequences , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[56]  Jinlong Yang,et al.  Analyzing Clothing Layer Deformation Statistics of 3D Human Motions , 2018, ECCV.

[57]  Pushmeet Kohli,et al.  Fusion4D , 2016, ACM Trans. Graph..

[58]  Tao Yu,et al.  RobustFusion: Human Volumetric Capture with Data-Driven Visual Cues Using a RGBD Camera , 2020, ECCV.

[59]  Christian Theobalt,et al.  Multi-Garment Net: Learning to Dress 3D People From Images , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[60]  Vladimir G. Kim,et al.  OptCuts: joint optimization of surface cuts and parameterization , 2019, ACM Trans. Graph..

[61]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[62]  Peter V. Gehler,et al.  DeepCut: Joint Subset Partition and Labeling for Multi Person Pose Estimation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[63]  Marcus A. Magnor,et al.  Learning to Reconstruct People in Clothing From a Single RGB Camera , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[64]  Jinlong Yang,et al.  Estimation of Human Body Shape in Motion with Wide Clothing , 2016, ECCV.

[65]  Dimitrios Tzionas,et al.  Expressive Body Capture: 3D Hands, Face, and Body From a Single Image , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[66]  Daniel Cremers,et al.  DeepWrinkles: Accurate and Realistic Clothing Modeling , 2018, ECCV.

[67]  Marcus A. Magnor,et al.  Tex2Shape: Detailed Full Human Body Geometry From a Single Image , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[68]  Hanbyul Joo,et al.  PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[69]  Kun Zhou,et al.  AutoSweep: Recovering 3D Editable Objects from a Single Photograph , 2020, IEEE Transactions on Visualization and Computer Graphics.

[70]  Michael J. Black,et al.  FAUST: Dataset and Evaluation for 3D Mesh Registration , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[71]  Chaitanya Patel,et al.  TailorNet: Predicting Clothing in 3D as a Function of Human Pose, Shape and Garment Style , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[72]  Minye Wu,et al.  ChallenCap: Monocular 3D Capture of Challenging Human Performances using Multi-Modal References , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[73]  Takeo Kanade,et al.  Panoptic Studio: A Massively Multiview System for Social Interaction Capture , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[74]  Adrian Hilton,et al.  A Layered Model of Human Body and Garment Deformation , 2014, 2014 2nd International Conference on 3D Vision.

[75]  Maxim Mikhnevich,et al.  Shape from Silhouette Under Varying Lighting and Multi-viewpoints , 2011, 2011 Canadian Conference on Computer and Robot Vision.

[76]  Michael J. Black,et al.  The Naked Truth: Estimating Body Shape Under Clothing , 2008, ECCV.

[77]  Peter V. Gehler,et al.  A Generative Model of People in Clothing , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[78]  Francesc Moreno-Noguer,et al.  3DPeople: Modeling the Geometry of Dressed Humans , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[79]  Xiaowei Zhou,et al.  Learning to Estimate 3D Human Pose and Shape from a Single Color Image , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[80]  Qionghai Dai,et al.  DoubleFusion: Real-Time Capture of Human Performances with Inner Body Shapes from a Single Depth Sensor , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[81]  Lan Xu,et al.  NeuralHumanFVV: Real-Time Neural Volumetric Human Performance Rendering using RGB Cameras , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[82]  Jochen Lang,et al.  Estimation of human body shape and posture under clothing , 2013, Comput. Vis. Image Underst..

[83]  Tao Yu,et al.  BodyFusion: Real-Time Capture of Human Motion and Surface Geometry Using a Single Depth Camera , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).