Multi‐View Stereo on Consistent Face Topology

We present a multi‐view stereo reconstruction technique that directly produces a complete high‐fidelity head model with consistent facial mesh topology. While existing techniques decouple shape estimation and facial tracking, our framework jointly optimizes for stereo constraints and consistent mesh parameterization. Our method is therefore free from drift and fully parallelizable for dynamic facial performance capture. We produce highly detailed facial geometries with artist‐quality UV parameterization, including secondary elements such as eyeballs, mouth pockets, nostrils, and the back of the head. Our approach consists of deforming a common template model to match multi‐view input images of the subject, while satisfying cross‐view, cross‐subject, and cross‐pose consistencies using a combination of 2D landmark detection, optical flow, and surface and volumetric Laplacian regularization. Since the flow is never computed between frames, our method is trivially parallelized by processing each frame independently. Accurate rigid head pose is extracted using a PCA‐based dimension reduction and denoising scheme. We demonstrate high‐fidelity performance capture results with challenging head motion and complex facial expressions around eye and mouth regions. While the quality of our results is on par with the current state‐of‐the‐art, our approach can be fully parallelized, does not suffer from drift, and produces face models with production‐quality mesh topologies.

[1]  Jean Ponce,et al.  Accurate, Dense, and Robust Multiview Stereopsis , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Daniel Cremers,et al.  Anisotropic Huber-L1 Optical Flow , 2009, BMVC.

[3]  Ira Kemelmacher-Shlizerman,et al.  Total Moving Face Reconstruction , 2014, ECCV.

[4]  Hao Li,et al.  Real-Time Facial Segmentation and Performance Capture from RGB Input , 2016, ECCV.

[5]  Derek Bradley,et al.  Rigid stabilization of facial expressions , 2014, ACM Trans. Graph..

[6]  Luc Van Gool,et al.  Face/Off: live facial puppetry , 2009, SCA '09.

[7]  Kun Zhou,et al.  Displaced dynamic expression regression for real-time facial tracking and animation , 2014, ACM Trans. Graph..

[8]  Jinxiang Chai,et al.  Leveraging motion capture and 3D scanning for high-fidelity facial performance acquisition , 2011, SIGGRAPH 2011.

[9]  M. Akca GENERALIZED PROCRUSTES ANALYSIS AND ITS APPLICATIONS IN PHOTOGRAMMETRY , 2003 .

[10]  Justus Thies,et al.  Face2Face: real-time face capture and reenactment of RGB videos , 2019, Commun. ACM.

[11]  Hang Si,et al.  TetGen, a Delaunay-Based Quality Tetrahedral Mesh Generator , 2015, ACM Trans. Math. Softw..

[12]  Ronald Fedkiw,et al.  Fully automatic generation of anatomical face simulation models , 2015, Symposium on Computer Animation.

[13]  Demetri Terzopoulos,et al.  Analysis and Synthesis of Facial Image Sequences Using Physical and Anatomical Models , 1993, IEEE Trans. Pattern Anal. Mach. Intell..

[14]  Derek Bradley,et al.  An anatomically-constrained local deformation model for monocular face capture , 2016, ACM Trans. Graph..

[15]  Henrique S. Malvar,et al.  Making Faces , 2019, Topoi.

[16]  Derek Bradley,et al.  High-quality passive facial performance capture using anchor frames , 2011, ACM Trans. Graph..

[17]  Jihun Yu,et al.  Unconstrained realtime facial performance capture , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Jernej Barbic,et al.  Skin microstructure deformation with displacement map convolution , 2015, ACM Trans. Graph..

[19]  Thomas Vetter,et al.  A morphable model for the synthesis of 3D faces , 1999, SIGGRAPH.

[20]  Kun Zhou,et al.  Real-time facial animation with image-based dynamic avatars , 2016, ACM Trans. Graph..

[21]  Mark Pauly,et al.  Example-based facial rigging , 2010, SIGGRAPH 2010.

[22]  Justus Thies,et al.  Real-time expression transfer for facial reenactment , 2015, ACM Trans. Graph..

[23]  Thabo Beeler,et al.  High-quality single-shot capture of facial geometry , 2010, ACM Trans. Graph..

[24]  Andrew Jones,et al.  Driving High-Resolution Facial Scans with Video Performance Capture , 2014, ACM Trans. Graph..

[25]  Davis E. King,et al.  Dlib-ml: A Machine Learning Toolkit , 2009, J. Mach. Learn. Res..

[26]  Martin Klaudiny,et al.  High-fidelity facial performance capture with non-sequential temporal alignment , 2012, FAA '12.

[27]  John M. Snyder,et al.  Large mesh deformation using the volumetric graph Laplacian , 2005, SIGGRAPH '05.

[28]  Thabo Beeler,et al.  Real-time high-fidelity facial performance capture , 2015, ACM Trans. Graph..

[29]  Norman I. Badler,et al.  Animating facial expressions , 1981, SIGGRAPH '81.

[30]  Derek Bradley,et al.  High resolution passive facial performance capture , 2010, ACM Trans. Graph..

[31]  Hans-Peter Seidel,et al.  Lightweight binocular facial performance capture under uncontrolled lighting , 2012, ACM Trans. Graph..

[32]  Hao Li,et al.  Realtime performance-based facial animation , 2011, ACM Trans. Graph..

[33]  Xin Tong,et al.  Automatic acquisition of high-fidelity facial performances using monocular videos , 2014, ACM Trans. Graph..

[34]  Ronald Fedkiw,et al.  Automatic determination of facial muscle activations from sparse motion capture marker data , 2005, ACM Trans. Graph..

[35]  Paul E. Debevec Digital ira: high-resolution facial performance playback , 2013, SIGGRAPH Computer Animation Festival.

[36]  Jean Ponce,et al.  Dense 3D motion capture for human faces , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[37]  Mark Sagar,et al.  The jester , 1999, SIGGRAPH '99.

[38]  Thabo Beeler,et al.  High-quality single-shot capture of facial geometry , 2010 .

[39]  Josephine Sullivan,et al.  One millisecond face alignment with an ensemble of regression trees , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[40]  John P. Lewis,et al.  Universal capture: image-based facial animation for "The Matrix Reloaded" , 2003, SIGGRAPH '03.

[41]  Thabo Beeler,et al.  FaceDirector: Continuous Control of Facial Performance in Video , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[42]  Marcus A. Magnor,et al.  Sparse localized deformation components , 2013, ACM Trans. Graph..

[43]  P. Debevec,et al.  Creating a Photoreal Digital Actor: The Digital Emily Project , 2009, 2009 Conference for Visual Media Production.

[44]  Paul E. Debevec,et al.  Multiview face capture using polarized spherical gradient illumination , 2011, ACM Trans. Graph..

[45]  Marc Alexa,et al.  As-rigid-as-possible surface modeling , 2007, Symposium on Geometry Processing.

[46]  Yuting Ye,et al.  High fidelity facial animation capture and retargeting with contours , 2013, SCA '13.

[47]  Hao Li,et al.  Example-based facial rigging , 2010, ACM Transactions on Graphics.

[48]  Yangang Wang,et al.  Online modeling for realtime facial animation , 2013, ACM Trans. Graph..

[49]  Jihun Yu,et al.  Realtime facial animation with on-the-fly correctives , 2013, ACM Trans. Graph..

[50]  Christian Theobalt,et al.  Reconstructing detailed dynamic face geometry from monocular video , 2013, ACM Trans. Graph..

[51]  Christian Theobalt,et al.  Reconstruction of Personalized 3D Face Rigs from Monocular Video , 2016, ACM Trans. Graph..

[52]  Joseph J. Lim,et al.  High-fidelity facial and speech animation for VR HMDs , 2016, ACM Trans. Graph..

[53]  Mark Pauly,et al.  Dynamic 3D avatar creation from hand-held video input , 2015, ACM Trans. Graph..

[54]  Paul Debevec,et al.  The Digital Emily project: photoreal facial modeling and animation , 2009, SIGGRAPH '09.

[55]  Lance Williams,et al.  Performance-driven facial animation , 1990, SIGGRAPH Courses.