Face Reconstruction on Mobile Devices Using a Height Map Shape Model and Fast Regularization

We present a system that reconstructs human faces on mobile devices using only on-device processing and the sensors typically built into a current commodity smartphone. Such technology can be used, for example, for facial authentication or as a fast preview for further post-processing. Our method builds on recently proposed techniques that compute depth maps by passive multi-view stereo directly on the device. We propose an efficient method that recovers the geometry of the face from the typically noisy point cloud. First, we show that the reconstruction can safely be restricted to a 2.5D height map representation. Based on this, we propose a novel low-dimensional height map shape model for faces that can be fitted to the input data efficiently, even on a mobile phone. To represent instance-specific shape details, such as moles, we augment the reconstruction from the shape model with a distance map that can be regularized efficiently. We thoroughly evaluate our approach on synthetic and real data, using both high-resolution depth data acquired with high-quality multi-view stereo and depth data computed directly on mobile phones.
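The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear shape model (a mean height map plus a few basis height maps) fitted by ridge-regularized least squares, and all function names, the grid layout, and the parameter `lam` are hypothetical. The sketch bins a point cloud into a 2.5D height map, fits the model coefficients, and computes the per-cell difference between data and model, which stands in for the detail map that would then be regularized (that step is omitted here).

```python
import math


def points_to_height_map(points, size, extent):
    """Bin (x, y, z) points into a size x size grid over [-extent, extent)^2.

    Each cell stores the mean z of the points that fall into it;
    cells without any point are None.
    """
    sums = [[0.0] * size for _ in range(size)]
    counts = [[0] * size for _ in range(size)]
    cell = 2.0 * extent / size
    for x, y, z in points:
        col = math.floor((x + extent) / cell)
        row = math.floor((y + extent) / cell)
        if 0 <= row < size and 0 <= col < size:
            sums[row][col] += z
            counts[row][col] += 1
    return [[sums[r][c] / counts[r][c] if counts[r][c] else None
             for c in range(size)] for r in range(size)]


def solve(A, b):
    """Solve the small dense system A x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def fit_shape_model(height, mean, bases, lam=1e-3):
    """Fit coefficients a so that mean + sum_k a[k] * bases[k] matches the
    observed cells, by solving (B^T B + lam I) a = B^T (h - mean).

    The ridge term lam is an assumption of this sketch.
    """
    k = len(bases)
    A = [[lam if i == j else 0.0 for j in range(k)] for i in range(k)]
    b = [0.0] * k
    for r, row in enumerate(height):
        for c, h in enumerate(row):
            if h is None:
                continue  # unobserved cells do not constrain the fit
            d = h - mean[r][c]
            for i in range(k):
                b[i] += bases[i][r][c] * d
                for j in range(k):
                    A[i][j] += bases[i][r][c] * bases[j][r][c]
    return solve(A, b)


def residual_map(height, mean, bases, coeffs):
    """Per-cell difference between the observed height and the fitted model;
    None where no height was observed."""
    out = []
    for r, row in enumerate(height):
        out_row = []
        for c, h in enumerate(row):
            if h is None:
                out_row.append(None)
            else:
                model = mean[r][c] + sum(a * B[r][c]
                                         for a, B in zip(coeffs, bases))
                out_row.append(h - model)
        out.append(out_row)
    return out
```

Because the model has only a handful of coefficients, the fit reduces to a small linear system regardless of the height map resolution, which is one reason a low-dimensional model is attractive on a phone.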
