Multi-view 3D face reconstruction with deep recurrent neural networks

Image-based 3D face reconstruction has great potential in areas such as facial recognition, facial analysis, and facial animation. Because image quality varies, a single image may not be sufficient to reconstruct a 3D face accurately. To overcome this limitation, multi-view 3D face reconstruction uses multiple images of the same subject and aggregates their complementary information for better accuracy. Though theoretically appealing, this approach faces multiple challenges in practice. The most significant is that it is difficult to establish coherent and accurate correspondence among a set of images, especially when the images are captured under different conditions. In this paper, we propose a method, Deep Recurrent 3D FAce Reconstruction (DRFAR), to solve the task of multi-view 3D face reconstruction using a subspace representation of the 3D facial shape and a deep recurrent neural network that consists of both a deep convolutional neural network (DCNN) and a recurrent neural network (RNN). The DCNN disentangles the facial identity and facial expression components of each image independently. The RNN fuses the identity-related features from the DCNN and aggregates the identity-specific contextual information, or identity signal, from the whole set of images to predict a facial identity parameter that is robust to variations in image quality and consistent over the whole set of images. Through extensive experiments, we evaluate the proposed method and demonstrate its superiority over existing methods.
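The data flow described in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes a linear subspace model S = mean + A_id * alpha + A_exp * beta, random vectors standing in for the per-image DCNN outputs, and a GRU cell standing in for the RNN (the abstract does not name the recurrent unit). All names, dimensions, and weights below are hypothetical.

```python
import math
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rand_matrix(rows, cols, rng, scale=0.1):
    """Random matrix standing in for learned weights or basis vectors."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class GRUCell:
    """Minimal GRU cell without biases; an assumed stand-in for the paper's RNN."""
    def __init__(self, in_dim, hid_dim, rng):
        self.Wz = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # update gate
        self.Wr = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # reset gate
        self.Wh = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # candidate state

    def step(self, x, h):
        xh = x + h  # list concatenation of input and previous state
        z = [sigmoid(a) for a in matvec(self.Wz, xh)]
        r = [sigmoid(a) for a in matvec(self.Wr, xh)]
        cand = [math.tanh(a)
                for a in matvec(self.Wh, x + [ri * hi for ri, hi in zip(r, h)])]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def fuse_identity(features, cell, W_out, hid_dim):
    """Run the recurrent cell over all per-image identity features and
    map the final hidden state to one identity parameter vector."""
    h = [0.0] * hid_dim
    for f in features:
        h = cell.step(f, h)
    return matvec(W_out, h), h

def reconstruct_shape(mean, A_id, A_exp, alpha, beta):
    """Subspace model: S = mean + A_id * alpha + A_exp * beta."""
    id_part = matvec(A_id, alpha)
    exp_part = matvec(A_exp, beta)
    return [m + i + e for m, i, e in zip(mean, id_part, exp_part)]

# Hypothetical sizes: feature, hidden, identity, expression, shape coordinates.
FEAT, HID, K_ID, K_EXP, N_COORDS, N_IMAGES = 8, 6, 4, 3, 12, 5
rng = random.Random(0)

# Stand-ins for per-image DCNN outputs: identity features and expression parameters.
id_features = [[rng.gauss(0, 1) for _ in range(FEAT)] for _ in range(N_IMAGES)]
expressions = [[rng.gauss(0, 1) for _ in range(K_EXP)] for _ in range(N_IMAGES)]

cell = GRUCell(FEAT, HID, rng)
W_out = rand_matrix(K_ID, HID, rng)
alpha, final_state = fuse_identity(id_features, cell, W_out, HID)

mean = [0.0] * N_COORDS
A_id = rand_matrix(N_COORDS, K_ID, rng, 1.0)
A_exp = rand_matrix(N_COORDS, K_EXP, rng, 1.0)

# One shared identity parameter, one expression parameter per image.
shapes = [reconstruct_shape(mean, A_id, A_exp, alpha, beta) for beta in expressions]
neutral = reconstruct_shape(mean, A_id, A_exp, alpha, [0.0] * K_EXP)
```

In this sketch every image of the set shares the single fused `alpha`, so the reconstructed shapes differ only in their expression term; that mirrors the abstract's claim that the identity parameter is consistent over the whole set.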
