Rapid one-shot acquisition of dynamic VR avatars

We present a system for rapid acquisition of bespoke, animatable, full-body avatars including face texture and shape. A blendshape rig with a skeleton is used as a template for customization. Identity blendshapes are used to customize the body and face shape at the fitting stage, while animation blendshapes allow the face to be animated. The subject assumes a T-pose and a single snapshot is captured using a stereo RGB plus depth sensor rig. Our system automatically aligns a photo texture and fits the 3D shape of the face. The body shape is stylized according to body dimensions estimated from segmented depth. The face identity blendweights are optimised according to image-based facial landmarks, while a custom texture map for the face is generated by warping the input images to a reference texture according to the facial landmarks. The total capture and processing time is under 10 seconds and the output is a light-weight, game-engine-ready avatar which is recognizable as the subject. We demonstrate our system in a VR environment in which each user sees the other users' animated avatars through a VR headset with real-time audio-based facial animation and live body motion tracking, affording an enhanced level of presence and social engagement compared to generic avatars.

[1]  Jongmoo Choi,et al.  Accurate Full Body Scanning from a Single Fixed 3D Camera , 2012, 2012 Second International Conference on 3D Imaging, Modeling, Processing, Visualization & Transmission.

[2]  Luc Van Gool,et al.  Blue-c: a spatially immersive display and 3D video portal for telepresence , 2003, IPT/EGVE.

[3]  Wan-Chun Ma,et al.  The Digital Emily Project: Achieving a Photorealistic Digital Actor , 2010, IEEE Computer Graphics and Applications.

[4]  Michael J. Black,et al.  Estimating human shape and pose from a single image , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[5]  Nadia Magnenat-Thalmann,et al.  An example-based approach to human body manipulation , 2004, Graph. Model..

[6]  Jan-Michael Frahm,et al.  Scanning and tracking dynamic objects with commodity depth cameras , 2013, 2013 IEEE International Symposium on Mixed and Augmented Reality (ISMAR).

[7]  Ari Shapiro,et al.  Avatar reshaping and automatic rigging using a deformable model , 2015, MIG.

[8]  David J. Kriegman,et al.  Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection , 1996, ECCV.

[9]  Simon Lucey,et al.  Real-time avatar animation from a single image , 2011, Face and Gesture 2011.

[10]  Michael J. Black,et al.  Home 3D body scans from noisy image and range data , 2011, 2011 International Conference on Computer Vision.

[11]  Mark Pauly,et al.  Dynamic 3D avatar creation from hand-held video input , 2015, ACM Trans. Graph..

[12]  Thabo Beeler,et al.  High-quality single-shot capture of facial geometry , 2010, ACM Trans. Graph..

[13]  Andrew W. Fitzgibbon,et al.  3D scanning deformable objects with a single RGBD sensor , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Alexei A. Efros,et al.  Image‐based Shaving , 2008, Comput. Graph. Forum.

[15]  Zoran Popovic,et al.  The space of human body shapes: reconstruction and parameterization from range scans , 2003, ACM Trans. Graph..

[16]  Jovan Popovic,et al.  Automatic rigging and animation of 3D characters , 2007, ACM Trans. Graph..

[17]  Mark Pauly,et al.  Realtime performance-based facial animation , 2011, ACM Trans. Graph..

[18]  Ken-ichi Anjyo,et al.  Practice and Theory of Blendshape Facial Models , 2014, Eurographics.

[19]  Kun Zhou,et al.  Real-time facial animation with image-based dynamic avatars , 2016, ACM Trans. Graph..

[20]  Jian-Jun Zhang,et al.  3D Body Shapes Estimation from Dressed‐Human Silhouettes , 2016, Comput. Graph. Forum.

[21]  Ira Kemelmacher-Shlizerman,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 3d Face Reconstruction from a Single Image Using a Single Reference Face Shape , 2022 .

[22]  Hans-Peter Seidel,et al.  A Statistical Model of Human Pose and Body Shape , 2009, Comput. Graph. Forum.

[23]  Edilson de Aguiar,et al.  Capture and Stylization of Human Models , 2013, SIBGRAPI.

[24]  Hans-Peter Seidel,et al.  MovieReshape: tracking and reshaping of humans in videos , 2010, ACM Trans. Graph..

[25]  Chongyang Ma,et al.  Facial performance sensing head-mounted display , 2015, ACM Trans. Graph..

[26]  Zhengyou Zhang,et al.  A Flexible New Technique for Camera Calibration , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[27]  Yaser Sheikh,et al.  Photogeometric Scene Flow for High-Detail Dynamic 3D Reconstruction , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[28]  Guillermo Sapiro,et al.  Navier-stokes, fluid dynamics, and image and video inpainting , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[29]  Ming Zeng,et al.  Templateless Quasi-rigid Shape Modeling with Implicit Loop-Closure , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[30]  Jonathan T. Barron,et al.  3D self-portraits , 2013, ACM Trans. Graph..

[31]  Yangang Wang,et al.  Online modeling for realtime facial animation , 2013, ACM Trans. Graph..

[32]  Ligang Liu,et al.  Scanning 3D Full Human Bodies Using Kinects , 2012, IEEE Transactions on Visualization and Computer Graphics.

[33]  Markus H. Gross,et al.  An Immersive Bidirectional System for Life-size 3D Communication , 2016, CASA.

[34]  Kun Zhou,et al.  Displaced dynamic expression regression for real-time facial tracking and animation , 2014, ACM Trans. Graph..

[35]  Andrew W. Fitzgibbon,et al.  KinectFusion: Real-time dense surface mapping and tracking , 2011, 2011 10th IEEE International Symposium on Mixed and Augmented Reality.

[36]  David J. Fleet,et al.  Physics-Based Human Motion Modeling for People Tracking : A Short Tutorial , 2009 .

[37]  Justus Thies,et al.  Face2Face: real-time face capture and reenactment of RGB videos , 2019, Commun. ACM.

[38]  Alvaro Collet,et al.  High-quality streamable free-viewpoint video , 2015, ACM Trans. Graph..

[39]  Tal Hassner,et al.  Age and Gender Estimation of Unfiltered Faces , 2014, IEEE Transactions on Information Forensics and Security.

[40]  Scott Schaefer,et al.  Image deformation using moving least squares , 2006, ACM Trans. Graph..

[41]  Gérard G. Medioni,et al.  Model-Assisted 3D Face Reconstruction from Video , 2007, AMFG.

[42]  Matthew Turk,et al.  A Morphable Model For The Synthesis Of 3D Faces , 1999, SIGGRAPH.

[43]  Jean-Charles Bazin,et al.  Gaze correction with a single webcam , 2014, 2014 IEEE International Conference on Multimedia and Expo (ICME).

[44]  Aaron Hertzmann,et al.  Eurographics/ Acm Siggraph Symposium on Computer Animation (2006) Learning a Correlated Model of Identity and Pose-dependent Body Shape Variation for Real-time Synthesis , 2022 .

[45]  D. Cohen-Or,et al.  Parametric reshaping of human bodies in images , 2010, ACM Trans. Graph..

[46]  Gérard G. Medioni,et al.  Rapid avatar capture and simulation using commodity depth sensors , 2014, Comput. Animat. Virtual Worlds.

[47]  Sebastian Thrun,et al.  SCAPE: shape completion and animation of people , 2005, SIGGRAPH '05.

[48]  Didier Stricker,et al.  KinectAvatar: Fully Automatic Body Capture Using a Single Kinect , 2012, ACCV Workshops.

[49]  Peter Robinson,et al.  OpenFace: An open source facial behavior analysis toolkit , 2016, 2016 IEEE Winter Conference on Applications of Computer Vision (WACV).

[50]  H. A. M. Daanen,et al.  3D whole body scanners revisited , 2013, Displays.

[51]  Paul E. Debevec,et al.  Rapid photorealistic blendshapes from commodity RGB-D sensors , 2015, I3D.

[52]  Christian Theobalt,et al.  Reconstruction of Personalized 3D Face Rigs from Monocular Video , 2016, ACM Trans. Graph..