Monocular 3D Body Shape Reconstruction under Clothing

Estimating the 3D shape of objects from monocular images is a well-established and challenging task in computer vision. Further challenges arise when highly deformable objects, such as human faces or bodies, are considered. In this work, we address the problem of estimating the 3D shape of a human body from a single image, focusing on the case in which the subject is wearing clothes. This scenario is highly challenging because loose clothes can hide much of the underlying body shape. To this end, we use a parametric 3D body model, SMPL, whose parameters describe the pose and shape of the body. Our main intuition is that the shape parameters associated with an individual should not change whether the subject is wearing clothes or not. To improve shape estimation under clothing, we train a deep convolutional network to regress the shape parameters from a single image of a person. To increase robustness to clothing, we build our training dataset by associating the shape parameters of a “minimally clothed” person with other samples of the same person wearing looser clothes. Experimental validation shows that our approach estimates body shape parameters more accurately than state-of-the-art approaches, even in the case of loose clothes.
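The dataset construction described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Sample` record, its field names, the `"minimal"`/`"loose"` clothing labels, and the `build_training_pairs` helper are all hypothetical, and the two-element shape vectors are placeholders. The sketch only shows the labelling idea, in which every image of a subject is paired with the shape parameters obtained from that subject's minimally clothed sample.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Shape = Tuple[float, ...]  # placeholder for a vector of SMPL shape parameters


@dataclass
class Sample:
    """Hypothetical record for one image of one subject."""
    subject_id: str
    image_path: str
    clothing: str   # assumed labels: "minimal" or "loose"
    betas: Shape    # shape parameters fitted to this particular sample


def build_training_pairs(samples: List[Sample]) -> List[Tuple[str, Shape]]:
    """Pair every image with its subject's minimally clothed shape."""
    # Reference shape per subject, taken from the minimally clothed sample.
    # If a subject has several such samples, the last one seen is kept.
    reference: Dict[str, Shape] = {}
    for s in samples:
        if s.clothing == "minimal":
            reference[s.subject_id] = s.betas

    # Each image inherits the reference shape as its regression target.
    # Subjects without a minimally clothed sample are skipped.
    pairs: List[Tuple[str, Shape]] = []
    for s in samples:
        if s.subject_id in reference:
            pairs.append((s.image_path, reference[s.subject_id]))
    return pairs
```

Under these assumptions, a network trained on the resulting pairs sees loosely clothed images labelled with the shape of the underlying body, which is the invariance the abstract argues for.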
