Paper information - Predicting Forward & Backward Facial Depth Maps From a Single RGB Image For Mobile 3d AR Application

Fast, low-cost 3D asset creation for AR/VR applications is a rapidly growing domain. This paper addresses the problem of reconstructing complete 3D information of a face at near real-time speed on a mobile phone. We propose a novel deep-learning-based solution that predicts robust depth maps of a face, one forward facing and the other backward facing, from a single in-the-wild image. A critical contribution is that the proposed network also learns the depths of the occluded parts of the face. This is achieved by training a fully convolutional neural network, with a common encoder and two separate decoders, to learn the dual (forward and backward) depth maps. The required dual depth maps are computed from the training data of 300W-LP, a point cloud dataset. The code and results will be made available on the project page.
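
The abstract says the dual depth maps are computed from point cloud data but does not spell out the procedure. The following is a minimal sketch of one plausible way to do it, not the authors' method. It assumes an orthographic projection along z, x/y coordinates already in pixel units, and that larger z means closer to the camera. The function name `dual_depth_maps` and the toy cloud are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]
DepthMap = List[List[Optional[float]]]


def dual_depth_maps(
    points: Sequence[Point], width: int, height: int
) -> Tuple[DepthMap, DepthMap]:
    """Rasterise a point cloud into a forward and a backward depth map.

    Assumptions (not from the paper): orthographic projection along z,
    x/y already in pixel units, larger z means closer to the camera.
    Pixels that no point projects onto stay None.
    """
    forward: DepthMap = [[None] * width for _ in range(height)]
    backward: DepthMap = [[None] * width for _ in range(height)]
    for x, y, z in points:
        col, row = int(round(x)), int(round(y))
        if not (0 <= col < width and 0 <= row < height):
            continue  # point projects outside the image
        # Forward map: keep the surface nearest the camera (largest z).
        if forward[row][col] is None or z > forward[row][col]:
            forward[row][col] = z
        # Backward map: keep the surface farthest from the camera (smallest z).
        if backward[row][col] is None or z < backward[row][col]:
            backward[row][col] = z
    return forward, backward


# Toy cloud: two points share pixel (col=1, row=0); one lies behind the other.
cloud = [(1.0, 0.0, 5.0), (1.0, 0.0, 2.0), (0.0, 1.0, 3.0)]
front, back = dual_depth_maps(cloud, width=2, height=2)
```

Where a pixel is hit by a single point, both maps agree. Where points overlap, the forward map records the visible surface and the backward map records the occluded one, which is the kind of target pair a shared-encoder, two-decoder network could be trained to regress.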

Mansi Sharma | P. Avinash
