Unsupervised Enhancement of Real-World Depth Images Using Tri-Cycle GAN

Low-quality depth poses a considerable challenge to computer vision algorithms. In this work we aim to enhance highly degraded, real-world depth images acquired by a low-cost sensor, for which an analytical noise model is unavailable. In the absence of clean ground truth, we approach the task as unsupervised domain translation between the low-quality sensor domain and a high-quality sensor domain, each represented by an unpaired training set. We apply the highly successful Cycle-GAN to this task, but find that it performs poorly. Identifying the sources of the failure, we introduce several modifications to the framework, including a larger generator architecture, depth-specific losses that take missing pixels into account, and a novel Tri-Cycle loss that promotes information preservation while addressing the asymmetry between the domains. We show that the resulting framework dramatically improves over the original Cycle-GAN both visually and quantitatively, extending its applicability to more challenging and asymmetric translation tasks.
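
The abstract mentions depth-specific losses that account for missing pixels but does not give their form. As a rough illustration only, the sketch below shows one common way to do this: an L1 loss averaged over valid pixels. The function name `masked_l1`, the convention that a depth of 0 marks a missing pixel, and the toy arrays are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def masked_l1(pred, target, invalid_value=0.0):
    """Mean absolute error over pixels where the target depth is valid.

    Assumption (not from the paper): missing depth is encoded as
    `invalid_value`, here 0.
    """
    valid = target != invalid_value      # boolean mask of usable pixels
    n_valid = valid.sum()
    if n_valid == 0:                     # no valid pixels: nothing to penalize
        return 0.0
    return float(np.abs(pred - target)[valid].sum() / n_valid)


# Toy 2x2 depth maps; the top-right target pixel is missing (0).
target = np.array([[1.0, 0.0], [2.0, 4.0]])
pred = np.array([[1.5, 9.0], [2.0, 3.0]])
loss = masked_l1(pred, target)
print(loss)  # 0.5: the large error at the missing pixel is ignored
```

Normalizing by the number of valid pixels, rather than by the image size, keeps the loss scale comparable between images with few and many holes.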
