H-GAN: the power of GANs in your Hands

We present HandGAN (H-GAN), a cycle-consistent adversarial learning approach with multi-scale perceptual discriminators, designed to translate synthetic images of hands to the real domain. Synthetic hands come with complete ground-truth annotations, yet they are not representative of the distribution of real-world data. We strive to combine a realistic hand appearance with synthetic annotations. Relying on image-to-image translation, we refine the appearance of synthetic hands to approximate the statistical distribution underlying a collection of real hand images. H-GAN tackles not only cross-domain tone mapping but also structural differences in localized areas such as shading discontinuities. Results are evaluated qualitatively and quantitatively, improving on previous work. Furthermore, we use a hand classification task to verify that our generated hands are statistically similar to real hands.
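To make the two key ingredients concrete, below is a minimal PyTorch sketch of (a) a multi-scale perceptual discriminator that scores images from frozen VGG-16 features at several depths, and (b) the standard L1 cycle-consistency loss between the two translation directions. This is an illustrative sketch under stated assumptions, not the authors' implementation: the class and function names (`PerceptualDiscriminator`, `G_s2r`, `G_r2s`) and the choice of VGG layers are hypothetical.

```python
# Illustrative sketch only: a multi-scale perceptual discriminator plus the
# L1 cycle-consistency loss. The VGG layer choices and all names are
# assumptions, not taken from the paper's code.
import torch
import torch.nn as nn
import torchvision.models as models


class PerceptualDiscriminator(nn.Module):
    """Scores real vs. translated hands from frozen VGG-16 features
    taken at several depths (multi-scale)."""

    def __init__(self, layer_ids=(3, 8, 15)):  # relu1_2, relu2_2, relu3_3
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features
        self.slices = nn.ModuleList()
        prev = 0
        for lid in layer_ids:
            self.slices.append(nn.Sequential(*list(vgg.children())[prev:lid + 1]))
            prev = lid + 1
        for p in self.parameters():  # freeze the perceptual backbone only
            p.requires_grad = False
        # one small trainable head per scale, each emitting a map of patch logits
        self.heads = nn.ModuleList(
            nn.Conv2d(c, 1, kernel_size=3, padding=1) for c in (64, 128, 256)
        )

    def forward(self, x):
        logits = []
        feat = x
        for vgg_slice, head in zip(self.slices, self.heads):
            feat = vgg_slice(feat)          # deeper VGG features at each step
            logits.append(head(feat))       # real/fake logits at this scale
        return logits


def cycle_loss(G_s2r, G_r2s, synth, real, lam=10.0):
    """L1 cycle consistency: synth -> real -> synth and real -> synth -> real."""
    l1 = nn.L1Loss()
    rec_synth = G_r2s(G_s2r(synth))
    rec_real = G_s2r(G_r2s(real))
    return lam * (l1(rec_synth, synth) + l1(rec_real, real))
```

Freezing the VGG backbone and training only the per-scale heads keeps the discriminator anchored to perceptual feature statistics rather than raw pixels, which is the usual motivation for perceptual discriminators in this setting.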
