Fast capture of textured full-body avatar with RGB-D cameras

We present a practical system that produces a textured full-body avatar within 3 s. It uses 16 RGB-depth (RGB-D) cameras: ten are arranged to capture the body, and six target the important head region. The camera configuration is formulated as a constraint-based minimum set space-covering problem, which is solved approximately by a heuristic algorithm. The resulting camera layout covers the full-body surface of an adult with geometric errors of less than 5 mm. Once arranged, the cameras are calibrated using a mannequin before real humans are scanned. All 16 RGB-D images are captured within 1 s, which spares the subject from holding still for an uncomfortable period and keeps pose changes between different cameras small. The scans are then combined and processed to reconstruct a photorealistic textured mesh in 2 s. During both system calibration and the capture of a real subject, the high-quality RGB information is exploited to assist geometric reconstruction and texture stitching optimization.
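
To make the camera-configuration step concrete, the sketch below shows one common heuristic for a minimum set-cover problem: repeatedly pick the candidate camera pose that sees the most surface patches not yet covered. This is only an illustration under stated assumptions. The abstract does not say which heuristic is used, so the greedy rule here is a stand-in, and the function name `greedy_camera_cover`, the candidate pose names, and the patch indices are all hypothetical. The placement constraints mentioned in the abstract are assumed to have been applied already when the candidate poses were generated.

```python
from typing import Dict, List, Set


def greedy_camera_cover(coverage: Dict[str, Set[int]], targets: Set[int]) -> List[str]:
    """Greedy set cover: choose candidate poses until every target patch is seen.

    coverage maps a (hypothetical) candidate pose name to the set of surface
    patch indices visible from it; targets is the set of patches to cover.
    This is an assumed heuristic, not necessarily the one used in the paper.
    """
    uncovered = set(targets)
    chosen: List[str] = []
    while uncovered:
        # Pick the candidate that sees the most still-uncovered patches.
        # Sorting the names makes tie-breaking deterministic.
        best = max(sorted(coverage), key=lambda c: len(coverage[c] & uncovered))
        gained = coverage[best] & uncovered
        if not gained:
            raise ValueError("remaining patches cannot be covered by any candidate")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Toy data (made up): eight surface patches and four candidate poses.
candidates = {
    "front": {0, 1, 2, 3},
    "back": {4, 5, 6},
    "left": {2, 3, 4},
    "head": {7},
}
layout = greedy_camera_cover(candidates, set(range(8)))
print(layout)
```

On this toy input the redundant "left" pose is never selected, because every patch it sees is already covered by the time it would be considered. A greedy choice of this kind gives an approximate rather than a guaranteed minimum cover, which matches the abstract's description of the problem being approximately solved.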
