Reconstructing 3D Face Models by Incremental Aggregation and Refinement of Depth Frames

Face recognition from two-dimensional (2D) still images and videos is quite successful even in “in the wild” conditions. By contrast, results are less consolidated when face data come from non-conventional cameras, such as infrared or depth cameras. In this article, we investigate the latter scenario, assuming that a low-resolution depth camera is used to perform face recognition in an uncooperative context. To this end, we first propose to automatically select from the camera's depth sequence a set of frames that provide a good view of the face in terms of pose and distance. Then, we design a progressive refinement approach to reconstruct a higher-resolution model from the selected low-resolution frames. This process accounts for the anisotropic error of both the existing points in the current 3D model and the points in a newly acquired frame, so that the refinement step can progressively adjust the point positions in the model using a Kalman-like estimation. The quality of the reconstructed models is evaluated by measuring the error between each reconstructed model and the corresponding high-resolution scan used as ground truth. In addition, we perform face recognition using the reconstructed models as probes against a gallery of reconstructed models and against a gallery of high-resolution scans. The results confirm that the reconstructed models can be used effectively for the face recognition task.
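
To make the idea of a Kalman-like refinement concrete, the following minimal sketch fuses one model point with one newly observed point. It is an illustration under stated assumptions, not the method of the article: the function name `fuse_point`, the numeric values, and the choice to model the anisotropic error as independent per-axis variances in the camera frame (a diagonal covariance, with a different variance along the depth axis) are all hypothetical.

```python
def fuse_point(model_pos, model_var, obs_pos, obs_var):
    """Per-axis Kalman-style update of a single 3D point.

    Assumption (not from the article): the error of each point is described
    by independent per-axis variances, so the covariance is diagonal and
    the update can be applied axis by axis.
    """
    fused_pos, fused_var = [], []
    for x, p, z, r in zip(model_pos, model_var, obs_pos, obs_var):
        gain = p / (p + r)                    # weight given to the new observation
        fused_pos.append(x + gain * (z - x))  # move the estimate toward the observation
        fused_var.append((1.0 - gain) * p)    # uncertainty shrinks after fusion
    return fused_pos, fused_var


# Hypothetical values: the model point is more uncertain along depth (z)
# than laterally (x, y), and the new frame measures depth more reliably.
model_pos = [0.0, 0.0, 1.00]
model_var = [1.0, 1.0, 9.0]
obs_pos = [0.5, 0.0, 1.03]
obs_var = [1.0, 1.0, 3.0]

pos, var = fuse_point(model_pos, model_var, obs_pos, obs_var)
print(pos, var)
```

Because the gain is computed per axis, an observation that is more reliable along one direction pulls the model point further along that direction only, and the fused variance on every axis is never larger than the model variance before the update.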
