Coupled Real-Synthetic Domain Adaptation for Real-World Deep Depth Enhancement

Advances in depth sensing technologies allow color and depth data to be acquired simultaneously in a wide range of environments. However, most depth sensors have a lower resolution than the associated color channels, and this mismatch can affect applications that require accurate depth recovery. Existing depth enhancement methods rely on simplistic noise models and do not generalize well to real-world conditions. This paper proposes a coupled real-synthetic domain adaptation method that enables domain transfer between high-quality depth simulators and real depth camera data for super-resolution depth recovery. The method first applies realistic degradation to synthetic depth images, and then restores the degraded depth data to high quality with a color-guided sub-network. The key advantage of the work is that it generalizes well to real-world datasets without further training or fine-tuning. Detailed quantitative and qualitative results are presented, demonstrating that the proposed method outperforms previous methods that were fine-tuned on the specific datasets.
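The two-stage data flow described above (degrade a clean synthetic depth map, then restore it with the help of a high-resolution color guide) can be illustrated with a minimal sketch. This is not the proposed networks: `degrade` is a hand-crafted stand-in (average pooling, Gaussian noise, random holes) for the learned synthetic-to-real degradation, and `enhance` uses classical joint bilateral upsampling in place of the color-guided sub-network. All function names, parameters, and default values are assumptions chosen for illustration.

```python
import math
import random


def degrade(depth_hr, scale=4, noise_std=0.01, hole_prob=0.05, seed=0):
    """Hand-crafted stand-in for a learned degradation (hypothetical parameters)."""
    rng = random.Random(seed)
    h_lr, w_lr = len(depth_hr) // scale, len(depth_hr[0]) // scale
    depth_lr = []
    for ly in range(h_lr):
        row = []
        for lx in range(w_lr):
            # Average-pool one scale x scale block down to a single sensor pixel.
            block = [depth_hr[ly * scale + i][lx * scale + j]
                     for i in range(scale) for j in range(scale)]
            value = sum(block) / len(block) + rng.gauss(0.0, noise_std)
            # Randomly drop pixels; invalid depth is encoded as 0.
            row.append(0.0 if rng.random() < hole_prob else value)
        depth_lr.append(row)
    return depth_lr


def enhance(depth_lr, guide_hr, scale=4, radius=1, sigma_s=1.0, sigma_r=0.1):
    """Joint bilateral upsampling: a classical stand-in for a color-guided network."""
    h_lr, w_lr = len(depth_lr), len(depth_lr[0])
    h, w = len(guide_hr), len(guide_hr[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            cy, cx = min(y // scale, h_lr - 1), min(x // scale, w_lr - 1)
            num = den = 0.0
            for ly in range(max(cy - radius, 0), min(cy + radius, h_lr - 1) + 1):
                for lx in range(max(cx - radius, 0), min(cx + radius, w_lr - 1) + 1):
                    d = depth_lr[ly][lx]
                    if d <= 0.0:
                        continue  # skip holes so they do not pull depth toward 0
                    # Guide intensity at the center of the low-resolution pixel.
                    gy = min(ly * scale + scale // 2, h - 1)
                    gx = min(lx * scale + scale // 2, w - 1)
                    w_s = math.exp(-((ly - cy) ** 2 + (lx - cx) ** 2)
                                   / (2 * sigma_s ** 2))
                    w_r = math.exp(-((guide_hr[y][x] - guide_hr[gy][gx]) ** 2)
                                   / (2 * sigma_r ** 2))
                    num += w_s * w_r * d
                    den += w_s * w_r
            out[y][x] = num / den if den > 0.0 else 0.0
    return out


# Toy scene: a depth step that coincides with an intensity edge in the guide.
depth_hr = [[1.0 if x < 16 else 2.0 for x in range(32)] for _ in range(32)]
guide_hr = [[0.0 if x < 16 else 1.0 for x in range(32)] for _ in range(32)]
depth_lr = degrade(depth_hr)            # simulated low-quality sensor output
restored = enhance(depth_lr, guide_hr)  # guided restoration at full resolution
print(len(depth_lr), len(depth_lr[0]), len(restored), len(restored[0]))
```

In this toy setup the range weight `w_r` suppresses contributions from low-resolution samples on the other side of the intensity edge, which is why a color guide helps keep depth discontinuities sharp. A learned degradation model would replace the fixed noise and hole statistics used here with ones adapted to real sensor data.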
