Predicting Forward & Backward Facial Depth Maps From a Single RGB Image For Mobile 3D AR Application

Fast, inexpensive 3D asset creation for AR/VR applications is a fast-growing domain. This paper addresses the problem of reconstructing complete 3D information of a face at near real-time speed on a mobile phone. We propose a novel deep-learning-based solution that predicts robust depth maps of a face, one forward facing and the other backward facing, from a single image in the wild. A critical contribution is that the proposed network also learns the depths of the occluded parts of the face. This is achieved by training a fully convolutional neural network with a common encoder and two separate decoders to learn the dual (forward and backward) depth maps. The dual depth maps required for training are computed from 300W-LP, a point cloud dataset. The code and results will be made available on the project page.
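
To make the idea of dual depth maps concrete, the following is a minimal sketch of how a forward and a backward depth map could be derived from a point cloud. It is not the paper's actual preprocessing, which the abstract does not describe. The function name `dual_depth_maps`, the orthographic projection, the one-pixel-per-unit grid, and the convention that smaller z means closer to the camera are all assumptions made for illustration.

```python
import math


def dual_depth_maps(points, width, height):
    """Return (front, back) depth maps for a list of (x, y, z) points.

    Assumptions (hypothetical, not from the paper): orthographic projection
    along z, one pixel per unit in x and y, smaller z is closer to the camera.
    """
    # Pixels that no point projects onto keep +inf (front) or -inf (back).
    front = [[math.inf] * width for _ in range(height)]
    back = [[-math.inf] * width for _ in range(height)]
    for x, y, z in points:
        col, row = math.floor(x), math.floor(y)
        if 0 <= col < width and 0 <= row < height:
            # Forward map: nearest surface seen from the camera.
            front[row][col] = min(front[row][col], z)
            # Backward map: farthest surface, i.e. the occluded side.
            back[row][col] = max(back[row][col], z)
    return front, back


# Two points share pixel (0, 0); one point falls on pixel (1, 0).
points = [(0.0, 0.0, 1.0), (0.0, 0.0, 5.0), (1.0, 0.0, 2.0)]
front, back = dual_depth_maps(points, width=2, height=1)
```

Where several points fall on the same pixel, the two maps differ, and the gap between them indicates the thickness of the surface along the viewing direction. Where only one point is visible, both maps hold the same value.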
