High-resolution 3D surface strain magnitude using 2D camera and low-resolution depth sensor

Measures the 3-D surface strain on the face during facial expressions.
Describes an automatic approach for calibrating the Kinect with an external camera.
Shows high correlation between 3-D strain maps calculated from two views.
Is robust to multiple depth resolutions.
Tested on over 100 subjects and 600 expressions.

Generating 2-D strain maps of the face during facial expressions provides a useful feature that captures the biomechanics of facial skin tissue, and has been widely applied in several research areas. However, most applications have been restricted to data collected from a single pose. Moreover, methods that rely strictly on 2-D images for motion estimation can suppress large strains because of projective distortions caused by the curvature of the face. This paper proposes a method that estimates 3-D surface strain using a low-resolution depth sensor. The algorithm automatically aligns a rough approximation of the 3-D surface with an image from an external high-resolution camera. We provide experimental results that demonstrate the robustness of the method on a dataset collected using the Microsoft Kinect synchronized with two external high-resolution cameras, as well as on 101 subjects from a publicly available 3-D facial expression video database (BU4DFE).
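
As an illustration of what a 2-D strain map is, the following minimal sketch computes a strain magnitude from a dense displacement field such as one produced by optical flow. It is not the paper's 3-D method: the abstract does not give a formula, so the sketch assumes the standard infinitesimal (small-deformation) strain tensor and takes its Frobenius norm as the magnitude. The function name `strain_magnitude` and the `spacing` parameter are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def strain_magnitude(u, v, spacing=1.0):
    """Per-pixel strain magnitude from a 2-D displacement field.

    u, v    : 2-D arrays of horizontal and vertical displacement.
    spacing : grid spacing (hypothetical parameter; 1.0 means one pixel).

    Assumption: infinitesimal strain, e = 0.5 * (grad(d) + grad(d)^T),
    with the Frobenius norm of e used as the magnitude.
    """
    # np.gradient returns derivatives along axis 0 (rows, y) first,
    # then along axis 1 (columns, x).
    du_dy, du_dx = np.gradient(u, spacing)
    dv_dy, dv_dx = np.gradient(v, spacing)

    e_xx = du_dx                    # normal strain along x
    e_yy = dv_dy                    # normal strain along y
    e_xy = 0.5 * (du_dy + dv_dx)    # shear strain (symmetric part)

    # The tensor is symmetric, so the shear term is counted twice.
    return np.sqrt(e_xx**2 + e_yy**2 + 2.0 * e_xy**2)


if __name__ == "__main__":
    ys, xs = np.mgrid[0:5, 0:6].astype(float)
    # A uniform 10% horizontal stretch.
    stretch = strain_magnitude(0.1 * xs, np.zeros_like(xs))
    print(stretch.round(3))
```

Under this definition a rigid translation or a small rotation yields zero strain, so only tissue deformation contributes to the map. Because the derivatives are taken in the image plane, the sketch is subject to the projective distortion on curved regions of the face that the abstract describes.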
