Im2Fit: Fast 3D Model Fitting and Anthropometrics using Single Consumer Depth Camera and Synthetic Data

Recent advances in consumer depth sensors have created many opportunities for human body measurement and modeling. Estimation of 3D body shape is particularly useful for fashion e-commerce applications such as virtual try-on or fit personalization. In this paper, we propose a method for capturing accurate human body shape and anthropometrics from a single consumer grade depth sensor. We first generate a large dataset of synthetic 3D human body models using real-world body size distributions. Next, we estimate key body measurements from a single monocular depth image. We combine body measurement estimates with local geometry features around key joint positions to form a robust multi-dimensional feature vector. This allows us to conduct a fast nearest-neighbor search to every sample in the dataset and return the closest one. Compared to existing methods, our approach is able to predict accurate full body parameters from a partial view using measurement parameters learned from the synthetic dataset. Furthermore, our system is capable of generating 3D human mesh models in real-time, which is significantly faster than methods which attempt to model shape and pose deformations. To validate the efficiency and applicability of our system, we collected a dataset that contains frontal and back scans of 83 clothed people with ground truth height and weight. Experiments on real-world dataset show that the proposed method can achieve real-time performance with competing results achieving an average error of 1.9 cm in estimated measurements.

[1]  Angelos Barmpoutis,et al.  Tensor Body: Real-Time Reconstruction of the Human Body and Avatar Synthesis From RGB-D , 2013, IEEE Transactions on Cybernetics.

[2]  Sebastian Thrun,et al.  SCAPE: shape completion and animation of people , 2005, SIGGRAPH '05.

[3]  Michael J. Black,et al.  Estimating human shape and pose from a single image , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[4]  Hans-Peter Seidel,et al.  Performance capture from sparse multi-view video , 2008, SIGGRAPH 2008.

[5]  Paul J. Besl,et al.  A Method for Registration of 3-D Shapes , 1992, IEEE Trans. Pattern Anal. Mach. Intell..

[6]  Kathleen M. Robinette,et al.  Civilian American and European Surface Anthropometry Resource (CAESAR), Final Report. Volume 1. Summary , 2002 .

[7]  Katherine M Flegal,et al.  Mean body weight, height, and body mass index, United States 1960-2002. , 2004, Advance data.

[8]  Michael J. Black,et al.  Model-based anthropometry: Predicting measurements from 3D human scans in multiple poses , 2014, IEEE Winter Conference on Applications of Computer Vision.

[9]  Michael J. Black,et al.  The Naked Truth: Estimating Body Shape Under Clothing , 2008, ECCV.

[10]  Michael J. Black,et al.  Home 3D body scans from noisy image and range data , 2011, 2011 International Conference on Computer Vision.

[11]  Michael J. Black,et al.  Detailed Human Shape and Pose from Images , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  Paul J. Besl,et al.  Method for registration of 3-D shapes , 1992, Other Conferences.

[13]  Zicheng Liu,et al.  Tensor-Based Human Body Modeling , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Andrew W. Fitzgibbon,et al.  Direct Least Square Fitting of Ellipses , 1999, IEEE Trans. Pattern Anal. Mach. Intell..

[15]  E. S. Maini Enhanced Direct Least Square Fitting of Ellipses , 2006, Int. J. Pattern Recognit. Artif. Intell..

[16]  Nico Blodow,et al.  Fast Point Feature Histograms (FPFH) for 3D registration , 2009, 2009 IEEE International Conference on Robotics and Automation.

[17]  Matthew Turk,et al.  A Morphable Model For The Synthesis Of 3D Faces , 1999, SIGGRAPH.

[18]  Radu Bogdan Rusu,et al.  3D is here: Point Cloud Library (PCL) , 2011, 2011 IEEE International Conference on Robotics and Automation.

[19]  Hans-Peter Seidel,et al.  Estimating body shape of dressed humans , 2009, Comput. Graph..

[20]  Michael J. Black,et al.  FAUST: Dataset and Evaluation for 3D Mesh Registration , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  Hans-Peter Seidel,et al.  Performance capture from sparse multi-view video , 2008, ACM Trans. Graph..