Real-time expression transfer for facial reenactment

We present a method for the real-time transfer of facial expressions from an actor in a source video to an actor in a target video, thus enabling the ad-hoc control of the facial expressions of the target actor. The novelty of our approach lies in the transfer and photorealistic re-rendering of facial deformations and detail into the target video, such that the newly synthesized expressions are virtually indistinguishable from a real video. To achieve this, we accurately capture the facial performances of the source and target subjects in real time using a commodity RGB-D sensor. For each frame, we jointly fit a parametric model for identity, expression, and skin reflectance to the input color and depth data, and also reconstruct the scene lighting. For expression transfer, we compute the difference between the source and target expressions in parameter space and modify the target parameters to match the source expressions. A major challenge is the convincing re-rendering of the synthesized target face into the corresponding video stream. This requires careful consideration of lighting and shading, both of which must correspond to the real-world environment. We demonstrate our method in a live setup, where we modify a video-conference feed in real time so that the target participant's facial expressions match those of a different person (e.g., a translator).
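To make the parameter-space transfer step concrete, the following minimal Python sketch (an illustration, not the authors' code) assumes the face is driven by blendshape-style expression coefficients: it takes the source's deviation from its own neutral expression, applies that offset to the target's parameters, and clamps the result to the valid range. All names and the coefficient count are hypothetical.

import numpy as np

def transfer_expression(beta_src, beta_src_neutral, beta_tgt_neutral,
                        lo=0.0, hi=1.0):
    """Map a source expression onto the target in blendshape-parameter space.

    beta_src         -- source expression coefficients for the current frame
    beta_src_neutral -- source coefficients for the neutral (rest) expression
    beta_tgt_neutral -- target coefficients for the neutral (rest) expression
    Returns coefficients that drive the target's personalized expression model.
    """
    # Offset of the source's current expression from its neutral pose.
    delta = np.asarray(beta_src) - np.asarray(beta_src_neutral)
    # Apply the same offset to the target's parameters and clamp to the
    # valid blendshape range so the result stays plausible.
    return np.clip(np.asarray(beta_tgt_neutral) + delta, lo, hi)

# Usage example with an illustrative coefficient count of 76.
rng = np.random.default_rng(0)
beta_src = rng.uniform(0.0, 1.0, 76)
beta_src_neutral = np.zeros(76)
beta_tgt_neutral = np.zeros(76)
beta_tgt = transfer_expression(beta_src, beta_src_neutral, beta_tgt_neutral)

In a real pipeline these coefficients would be re-estimated per frame from the RGB-D fit and the transferred parameters would then be rendered into the target video under the reconstructed scene lighting.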
