Unpaired Depth Super-Resolution in the Wild

Depth maps captured with commodity sensors are often of low quality and resolution; they must be enhanced before use in many applications. State-of-the-art data-driven methods for depth map super-resolution rely on registered pairs of low- and high-resolution depth maps of the same scenes. Acquiring real-world paired data requires specialized setups. The alternative, generating low-resolution maps from high-resolution ones by subsampling, adding noise, and applying other artificial degradations, does not fully capture the characteristics of real-world low-resolution images. As a consequence, supervised learning methods trained on such artificial paired data may not perform well on real-world low-resolution inputs. We consider an approach to depth super-resolution based on learning from unpaired data. While many techniques for unpaired image-to-image translation have been proposed, most fail to fill holes effectively or to reconstruct accurate surfaces when applied to depth maps. We propose an unpaired learning method for depth super-resolution that combines a learnable degradation model, an enhancement component, and surface normal estimates used as features to produce more accurate depth maps. We introduce a benchmark for unpaired depth SR and demonstrate that our method outperforms existing unpaired methods and performs on par with paired ones.

†Joint second-author contribution.
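The abstract mentions using surface normal estimates as features for depth reconstruction. One common way to obtain such estimates is directly from the depth map via finite differences; the sketch below illustrates this under an orthographic-camera assumption (a simplification for illustration, not the paper's exact formulation):

```python
import numpy as np

def normals_from_depth(depth: np.ndarray) -> np.ndarray:
    """Estimate per-pixel surface normals from a depth map via finite
    differences, assuming an orthographic camera for simplicity."""
    dz_dy, dz_dx = np.gradient(depth.astype(np.float64))
    # The surface z = depth(x, y) has normal proportional to (-dz/dx, -dz/dy, 1).
    n = np.stack(
        [-dz_dx, -dz_dy, np.ones_like(depth, dtype=np.float64)], axis=-1
    )
    n /= np.linalg.norm(n, axis=-1, keepdims=True)
    return n  # shape (H, W, 3), unit-length normals

# Sanity check: a planar ramp tilting along x yields a constant normal.
depth = np.tile(np.arange(8, dtype=np.float64), (8, 1))  # dz/dx = 1, dz/dy = 0
normals = normals_from_depth(depth)
```

In a learning pipeline, such a normal map would typically be concatenated with the depth map as an extra input channel, making surface orientation explicit to the network.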
