Perceptual Deep Depth Super-Resolution

RGBD images, combining high-resolution color and lower-resolution depth from various types of depth sensors, are increasingly common. One can significantly improve the resolution of depth maps by taking advantage of color information; deep learning methods make combining color and depth information particularly easy. However, fusing these two sources of data may lead to a variety of artifacts. If depth maps are used to reconstruct 3D shapes, e.g., for virtual reality applications, the visual quality of upsampled images is particularly important. The main idea of our approach is to measure the quality of depth map upsampling using renderings of resulting 3D surfaces. We demonstrate that a simple visual appearance-based loss, when used with either a trained CNN or simply a deep prior, yields significantly improved 3D shapes, as measured by a number of existing perceptual metrics. We compare this approach with a number of existing optimization and learning-based techniques.
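The idea of scoring an upsampled depth map by how its surface looks when rendered, rather than by per-pixel depth error, can be illustrated with a minimal NumPy sketch. This is not the paper's loss, whose exact form the abstract does not specify. It assumes an orthographic height field with unit pixel spacing, Lambertian shading with constant albedo, and a hypothetical set of three light directions; the function names `normals_from_depth`, `render_lambertian`, and `rendering_loss` are illustrative only.

```python
import numpy as np


def normals_from_depth(depth):
    """Unit surface normals of the height field z = depth[y, x].

    Assumes orthographic projection and unit pixel spacing (illustrative).
    """
    dz_dy, dz_dx = np.gradient(depth.astype(np.float64))
    n = np.stack([-dz_dx, -dz_dy, np.ones_like(dz_dx)], axis=-1)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)


def render_lambertian(depth, light):
    """Shade the surface with one directional light and constant albedo."""
    light = np.asarray(light, dtype=np.float64)
    light = light / np.linalg.norm(light)
    return np.clip(normals_from_depth(depth) @ light, 0.0, None)


def rendering_loss(pred_depth, true_depth,
                   lights=((0, 0, 1), (1, 0, 1), (0, 1, 1))):
    """Mean squared difference between renderings, averaged over lights.

    The light directions are an assumption made for this sketch.
    """
    errors = [
        np.mean((render_lambertian(pred_depth, l)
                 - render_lambertian(true_depth, l)) ** 2)
        for l in lights
    ]
    return float(np.mean(errors))


if __name__ == "__main__":
    y, x = np.mgrid[0:32, 0:32]
    plane = 0.1 * x + 0.05 * y
    rng = np.random.default_rng(0)
    noisy = plane + 0.05 * rng.standard_normal(plane.shape)
    # A constant depth offset leaves the rendering unchanged, while
    # small high-frequency noise visibly changes the shading.
    print(rendering_loss(plane + 5.0, plane))
    print(rendering_loss(noisy, plane))
```

Because shading depends on surface normals, a loss of this kind ignores a constant depth offset but penalizes small high-frequency errors that a plain depth-space error would barely register. In a training setting the same computation would be written in an automatic differentiation framework so that gradients reach the depth prediction.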
