HairStep: Transfer Synthetic to Real Using Strand and Depth Maps for Single-View 3D Hair Modeling

In this work, we tackle the challenging problem of learning-based single-view 3D hair modeling. Because paired real images and 3D hair data are very difficult to collect, using synthetic data to provide prior knowledge for the real domain has become a leading solution. This, unfortunately, introduces a domain gap. Because realistic hair rendering is inherently difficult, existing methods typically take orientation maps rather than hair images as input to bridge the gap. We agree that an intermediate representation is essential, but we argue that orientation maps produced by the dominant filtering-based methods are sensitive to noise and are far from a competent representation. We therefore raise this issue and propose a novel intermediate representation, termed HairStep, which consists of a strand map and a depth map. We find that HairStep not only provides sufficient information for accurate 3D hair modeling but can also feasibly be inferred from real images. Specifically, we collect a dataset of 1,250 portrait images with two types of annotations, and we design a learning framework that converts real images into the strand map and depth map. An additional benefit of the new dataset is that it enables the first quantitative metric for 3D hair modeling. Our experiments show that HairStep narrows the domain gap between synthetic and real data and achieves state-of-the-art performance on single-view 3D hair reconstruction.
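
For readers unfamiliar with filtering-based orientation maps, the following is a minimal sketch of one common way to compute them, and not the pipeline of any specific method discussed here. It assumes a bank of Gabor filters (the abstract does not name the filter) and picks, per pixel, the angle with the strongest response. The function names `gabor_kernel`, `response_magnitude`, and `orientation_map`, and all parameter values, are hypothetical choices made for illustration.

```python
import numpy as np


def gabor_kernel(theta, size=15, sigma=3.0, wavelength=6.0):
    """Complex Gabor kernel tuned to stripes running along angle `theta` (radians)."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Signed distance across the stripes (perpendicular to the stripe direction).
    across = -xs * np.sin(theta) + ys * np.cos(theta)
    envelope = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.exp(2j * np.pi * across / wavelength)
    return kernel - kernel.mean()  # zero mean, so flat regions give no response


def response_magnitude(image, kernel):
    """Magnitude of the circular convolution of `image` with `kernel`, via the FFT."""
    padded = np.zeros(image.shape, dtype=complex)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    # Shift the kernel centre to the origin so the output is not translated.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.abs(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))


def orientation_map(image, num_angles=16):
    """Per-pixel dominant angle in [0, pi) and the filter response at that angle."""
    thetas = np.arange(num_angles) * np.pi / num_angles
    responses = np.stack(
        [response_magnitude(image, gabor_kernel(t)) for t in thetas]
    )
    best = responses.argmax(axis=0)       # index of the strongest filter per pixel
    confidence = responses.max(axis=0)    # strength of that response
    return thetas[best], confidence
```

Two properties of this sketch are worth noting. The returned angle lies in [0, π), so it describes an undirected line and cannot distinguish a direction from its opposite. The per-pixel `argmax` is taken over raw filter responses, which is one place where image noise can change the result.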
