H-GAN: the power of GANs in your Hands

We present HandGAN (H-GAN), a cycle-consistent adversarial learning approach with multi-scale perceptual discriminators, designed to translate synthetic images of hands to the real domain. Synthetic hands come with complete ground-truth annotations, yet they are not representative of the distribution of real-world data. We strive to combine a realistic hand appearance with synthetic annotations. Relying on image-to-image translation, we refine the appearance of synthetic hands to approximate the statistical distribution underlying a collection of real hand images. H-GAN tackles not only cross-domain tone mapping but also structural differences in localized areas such as shading discontinuities. Results are evaluated qualitatively and quantitatively, improving on previous work. Furthermore, we use a hand classification task to verify that our generated hands are statistically similar to real hands.
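To make the two key ingredients concrete, below is a minimal PyTorch sketch of (a) a multi-scale perceptual discriminator that scores images from frozen VGG-16 features at several depths, and (b) the standard L1 cycle-consistency loss between the two translation directions. This is an illustrative sketch under stated assumptions, not the authors' implementation: the class and function names (`PerceptualDiscriminator`, `G_s2r`, `G_r2s`) and the choice of VGG layers are hypothetical.

```python
# Illustrative sketch only: a multi-scale perceptual discriminator plus the
# L1 cycle-consistency loss. The VGG layer choices and all names are
# assumptions, not taken from the paper's code.
import torch
import torch.nn as nn
import torchvision.models as models


class PerceptualDiscriminator(nn.Module):
    """Scores real vs. translated hands from frozen VGG-16 features
    taken at several depths (multi-scale)."""

    def __init__(self, layer_ids=(3, 8, 15)):  # relu1_2, relu2_2, relu3_3
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features
        self.slices = nn.ModuleList()
        prev = 0
        for lid in layer_ids:
            self.slices.append(nn.Sequential(*list(vgg.children())[prev:lid + 1]))
            prev = lid + 1
        for p in self.parameters():  # freeze the perceptual backbone only
            p.requires_grad = False
        # one small trainable head per scale, each emitting a map of patch logits
        self.heads = nn.ModuleList(
            nn.Conv2d(c, 1, kernel_size=3, padding=1) for c in (64, 128, 256)
        )

    def forward(self, x):
        logits = []
        feat = x
        for vgg_slice, head in zip(self.slices, self.heads):
            feat = vgg_slice(feat)          # deeper VGG features at each step
            logits.append(head(feat))       # real/fake logits at this scale
        return logits


def cycle_loss(G_s2r, G_r2s, synth, real, lam=10.0):
    """L1 cycle consistency: synth -> real -> synth and real -> synth -> real."""
    l1 = nn.L1Loss()
    rec_synth = G_r2s(G_s2r(synth))
    rec_real = G_s2r(G_r2s(real))
    return lam * (l1(rec_synth, synth) + l1(rec_real, real))
```

Freezing the VGG backbone and training only the per-scale heads keeps the discriminator anchored to perceptual feature statistics rather than raw pixels, which is the usual motivation for perceptual discriminators in this setting.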
