论文信息 - TightCap: 3D Human Shape Capture with Clothing Tightness Field

TightCap: 3D Human Shape Capture with Clothing Tightness Field

In this paper, we present TightCap, a data-driven scheme to capture both the human shape and dressed garments accurately with only a single 3D human scan, which enables numerous applications such as virtual try-on, biometrics and body evaluation. To break the severe variations of the human poses and garments, we propose to model the clothing tightness - the displacements from the garments to the human shape implicitly in the global UV texturing domain. To this end, we utilize an enhanced statistical human template and an effective multi-stage alignment scheme to map the 3D scan into a hybrid 2D geometry image. Based on this 2D representation, we propose a novel framework to predicted clothing tightness via a novel tightness formulation, as well as an effective optimization scheme to further reconstruct multi-layer human shape and garments under various clothing categories and human postures. We further propose a new clothing tightness dataset (CTD) of human scans with a large variety of clothing styles, poses and corresponding ground-truth human shapes to stimulate further research. Extensive experiments demonstrate the effectiveness of our TightCap to achieve high-quality human shape and dressed garments reconstruction, as well as the further applications for clothing segmentation, retargeting and animation.

[1] Marc Alexa,et al. As-rigid-as-possible surface modeling , 2007, Symposium on Geometry Processing.

[2] Jitendra Malik,et al. End-to-End Recovery of Human Shape and Pose , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[3] Yaser Sheikh,et al. OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4] Hao Li,et al. ARCH: Animatable Reconstruction of Clothed Humans , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[5] Yaser Sheikh,et al. Monocular Total Capture: Posing Face, Body, and Hands in the Wild , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[6] Michael J. Black,et al. Detailed, Accurate, Human Shape Estimation from Clothed 3D Scan Sequences , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7] Hao Li,et al. PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[8] Yaser Sheikh,et al. Hand Keypoint Detection in Single Images Using Multiview Bootstrapping , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9] Carlos Hernandez,et al. Multi-View Stereo: A Tutorial , 2015, Found. Trends Comput. Graph. Vis..

[10] Song-Chun Zhu,et al. DenseRaC: Joint 3D Pose and Shape Estimation by Dense Render-and-Compare , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[11] Jingyi Yu,et al. SportsCap: Monocular 3D Human Motion Capture and Fine-Grained Understanding in Challenging Sports Videos , 2021, International Journal of Computer Vision.

[12] Jingyi Yu,et al. Few-shot Neural Human Performance Rendering from Sparse RGBD Videos , 2021, IJCAI.

[13] Jia Deng,et al. Stacked Hourglass Networks for Human Pose Estimation , 2016, ECCV.

[14] Lu Fang,et al. UnstructuredFusion: Realtime 4D Geometry and Texture Reconstruction Using Commercial RGBD Cameras , 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15] Jan-Michael Frahm,et al. Pixelwise View Selection for Unstructured Multi-View Stereo , 2016, ECCV.

[16] Ligang Liu,et al. Scanning 3D Full Human Bodies Using Kinects , 2012, IEEE Transactions on Visualization and Computer Graphics.

[17] Alvaro Collet,et al. High-quality streamable free-viewpoint video , 2015, ACM Trans. Graph..

[18] Paolo Cignoni,et al. Metro: Measuring Error on Simplified Surfaces , 1998, Comput. Graph. Forum.

[19] Takeo Kanade,et al. Shape-From-Silhouette Across Time Part II: Applications to Human Modeling and Markerless Motion Tracking , 2005, International Journal of Computer Vision.

[20] Dieter Fox,et al. DynamicFusion: Reconstruction and tracking of non-rigid scenes in real-time , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21] Hans-Peter Seidel,et al. Estimating body shape of dressed humans , 2009, Comput. Graph..

[22] Yaser Sheikh,et al. OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23] Takeo Kanade,et al. Shape-from-silhouette of articulated objects and its use for human body kinematics estimation and motion capture , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[24] Alexander M. Bronstein,et al. Deformable Shape Completion with Graph Convolutional Autoencoders , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[25] Michael J. Black,et al. Learning to Dress 3D People in Generative Clothing , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[26] Tao Yu,et al. DeepHuman: 3D Human Reconstruction From a Single Image , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[27] M. Pauly,et al. Embedded deformation for shape manipulation , 2007, SIGGRAPH 2007.

[28] Marcus A. Magnor,et al. Video Based Reconstruction of 3D People Models , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[29] Bharat Lal Bhatnagar,et al. Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction , 2020, ECCV.

[30] Xin Chen,et al. Pose2Body: Pose-Guided Human Parts Segmentation , 2019, 2019 IEEE International Conference on Multimedia and Expo (ICME).

[31] Varun Ramakrishna,et al. Convolutional Pose Machines , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32] Andrew J. Davison,et al. DTAM: Dense tracking and mapping in real-time , 2011, 2011 International Conference on Computer Vision.

[33] Yaser Sheikh,et al. Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[34] Hao Li,et al. SiCloPe: Silhouette-Based Clothed People , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[35] Dragomir Anguelov,et al. SCAPE: shape completion and animation of people , 2005, ACM Trans. Graph..

[36] Andrew W. Fitzgibbon,et al. Bundle Adjustment - A Modern Synthesis , 1999, Workshop on Vision Algorithms.

[37] Peter V. Gehler,et al. Unite the People: Closing the Loop Between 3D and 2D Human Representations , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[38] Andrew W. Fitzgibbon,et al. KinectFusion: Real-time dense surface mapping and tracking , 2011, 2011 10th IEEE International Symposium on Mixed and Augmented Reality.

[39] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[40] Alexei A. Efros,et al. The Unreasonable Effectiveness of Deep Features as a Perceptual Metric , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[41] Eero P. Simoncelli,et al. Image quality assessment: from error visibility to structural similarity , 2004, IEEE Transactions on Image Processing.

[42] Pascal Fua,et al. On benchmarking camera calibration and multi-view stereo for high resolution imagery , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[43] Iasonas Kokkinos,et al. DensePose: Dense Human Pose Estimation in the Wild , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[44] Yaser Sheikh,et al. Fully Convolutional Mesh Autoencoder using Efficient Spatially Varying Kernels , 2020, NeurIPS.

[45] C. Cobelli,et al. A Markerless Motion Capture System to Study Musculoskeletal Biomechanics: Visual Hull and Simulated Annealing Approach , 2006, Annals of Biomedical Engineering.

[46] Gerard Pons-Moll,et al. 360-Degree Textures of People in Clothing from a Single Image , 2019, 2019 International Conference on 3D Vision (3DV).

[47] Sebastian Thrun,et al. SCAPE: shape completion and animation of people , 2005, SIGGRAPH 2005.

[48] Michael J. Black,et al. SMPL: A Skinned Multi-Person Linear Model , 2023 .

[49] Dima Damen,et al. Recognizing linked events: Searching the space of feasible explanations , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[50] Steven J. Gortler,et al. Geometry images , 2002, SIGGRAPH.

[51] Takeo Kanade,et al. Visual hull alignment and refinement across time: a 3D reconstruction algorithm combining shape-from-silhouette with stereo , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[52] Christian Theobalt,et al. MonoPerfCap , 2017, ACM Trans. Graph..

[53] Tony Tung,et al. SIZER: A Dataset and Model for Parsing 3D Clothing and Learning Size Sensitive 3D Clothing , 2020, ECCV.

[54] Qionghai Dai,et al. SimulCap : Single-View Human Performance Capture With Cloth Simulation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[55] Michael J. Black,et al. Detailed Full-Body Reconstructions of Moving People from Monocular RGB-D Sequences , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[56] Jinlong Yang,et al. Analyzing Clothing Layer Deformation Statistics of 3D Human Motions , 2018, ECCV.

[57] Pushmeet Kohli,et al. Fusion4D , 2016, ACM Trans. Graph..

[58] Tao Yu,et al. RobustFusion: Human Volumetric Capture with Data-Driven Visual Cues Using a RGBD Camera , 2020, ECCV.

[59] Christian Theobalt,et al. Multi-Garment Net: Learning to Dress 3D People From Images , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[60] Vladimir G. Kim,et al. OptCuts: joint optimization of surface cuts and parameterization , 2019, ACM Trans. Graph..

[61] Alexei A. Efros,et al. Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[62] Peter V. Gehler,et al. DeepCut: Joint Subset Partition and Labeling for Multi Person Pose Estimation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[63] Marcus A. Magnor,et al. Learning to Reconstruct People in Clothing From a Single RGB Camera , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[64] Jinlong Yang,et al. Estimation of Human Body Shape in Motion with Wide Clothing , 2016, ECCV.

[65] Dimitrios Tzionas,et al. Expressive Body Capture: 3D Hands, Face, and Body From a Single Image , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[66] Daniel Cremers,et al. DeepWrinkles: Accurate and Realistic Clothing Modeling , 2018, ECCV.

[67] Marcus A. Magnor,et al. Tex2Shape: Detailed Full Human Body Geometry From a Single Image , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[68] Hanbyul Joo,et al. PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[69] Kun Zhou,et al. AutoSweep: Recovering 3D Editable Objects from a Single Photograph , 2020, IEEE Transactions on Visualization and Computer Graphics.

[70] Michael J. Black,et al. FAUST: Dataset and Evaluation for 3D Mesh Registration , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[71] Chaitanya Patel,et al. TailorNet: Predicting Clothing in 3D as a Function of Human Pose, Shape and Garment Style , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[72] Minye Wu,et al. ChallenCap: Monocular 3D Capture of Challenging Human Performances using Multi-Modal References , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[73] Takeo Kanade,et al. Panoptic Studio: A Massively Multiview System for Social Interaction Capture , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[74] Adrian Hilton,et al. A Layered Model of Human Body and Garment Deformation , 2014, 2014 2nd International Conference on 3D Vision.

[75] Maxim Mikhnevich,et al. Shape from Silhouette Under Varying Lighting and Multi-viewpoints , 2011, 2011 Canadian Conference on Computer and Robot Vision.

[76] Michael J. Black,et al. The Naked Truth: Estimating Body Shape Under Clothing , 2008, ECCV.

[77] Peter V. Gehler,et al. A Generative Model of People in Clothing , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[78] Francesc Moreno-Noguer,et al. 3DPeople: Modeling the Geometry of Dressed Humans , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[79] Xiaowei Zhou,et al. Learning to Estimate 3D Human Pose and Shape from a Single Color Image , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[80] Qionghai Dai,et al. DoubleFusion: Real-Time Capture of Human Performances with Inner Body Shapes from a Single Depth Sensor , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[81] Lan Xu,et al. NeuralHumanFVV: Real-Time Neural Volumetric Human Performance Rendering using RGB Cameras , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[82] Jochen Lang,et al. Estimation of human body shape and posture under clothing , 2013, Comput. Vis. Image Underst..

[83] Tao Yu,et al. BodyFusion: Real-Time Capture of Human Motion and Surface Geometry Using a Single Depth Camera , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).