Deeply Supervised Depth Map Super-Resolution as Novel View Synthesis

Deep convolutional neural network (DCNN) has been successfully applied to depth map super-resolution and outperforms existing methods by a wide margin. However, there still exist two major issues with these DCNN-based depth map super-resolution methods that hinder the performance: 1) the low-resolution depth maps either need to be up-sampled before feeding into the network or substantial deconvolution has to be used and 2) the supervision (high-resolution depth maps) is only applied at the end of the network, thus it is difficult to handle large up-sampling factors, such as <inline-formula> <tex-math notation="LaTeX">$\times 8$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$\times 16$ </tex-math></inline-formula>. In this paper, we propose a new framework to tackle the above problems. First, we propose to represent the task of depth map super-resolution as a series of novel view synthesis sub-tasks. The novel view synthesis sub-task aims at generating (synthesizing) a depth map from a different camera pose, which could be learned in parallel. Second, to handle large up-sampling factors, we present a deeply supervised network structure to enforce strong supervision in each stage of the network. Third, a multi-scale fusion strategy is proposed to effectively exploit the feature maps at different scales and handle the blocking effect. In this way, our proposed framework could deal with challenging depth map super-resolution efficiently under large up-sampling factors (e.g., <inline-formula> <tex-math notation="LaTeX">$\times 8$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$\times 16$ </tex-math></inline-formula>). Our method only uses the low-resolution depth map as input, and the support of color image is not needed, which greatly reduces the restriction of our method. Extensive experiments on various benchmarking data sets demonstrate the superiority of our method over current state-of-the-art depth map super-resolution methods.

[1]  Jing Zhang,et al.  Deep Unsupervised Saliency Detection: A Multiple Noisy Labeling Perspective , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[2]  Fei Zhou,et al.  Nonlocal Pixel Selection for Multisurface Fitting-Based Super-Resolution , 2014, IEEE Transactions on Circuits and Systems for Video Technology.

[3]  Andrea Vedaldi,et al.  MatConvNet: Convolutional Neural Networks for MATLAB , 2014, ACM Multimedia.

[4]  Xueying Qin,et al.  Deep Depth Super-Resolution: Learning Depth Super-Resolution Using Deep Convolutional Neural Network , 2016, ACCV.

[5]  Narendra Ahuja,et al.  Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  H. Sebastian Seung,et al.  Natural Image Denoising with Convolutional Networks , 2008, NIPS.

[7]  Xi Wang,et al.  High-Resolution Stereo Datasets with Subpixel-Accurate Ground Truth , 2014, GCPR.

[8]  Martin Kleinsteuber,et al.  A Joint Intensity and Depth Co-sparse Analysis Model for Depth Map Super-resolution , 2013, 2013 IEEE International Conference on Computer Vision.

[9]  Horst Bischof,et al.  Variational Depth Superresolution Using Example-Based Edge Representations , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[10]  Narendra Ahuja,et al.  Single image super-resolution from transformed self-exemplars , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Wotao Yin,et al.  Iteratively reweighted algorithms for compressive sensing , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[12]  Richard Szeliski,et al.  High-accuracy stereo depth maps using structured light , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[13]  Thomas S. Huang,et al.  Deep Networks for Image Super-Resolution with Sparse Prior , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[14]  查正军,et al.  A Unified Scheme for Super-resolution and Depth Estimation from Asymmetric Stereoscopic Video , 2016 .

[15]  William T. Freeman,et al.  Example-Based Super-Resolution , 2002, IEEE Computer Graphics and Applications.

[16]  Horst Bischof,et al.  ATGV-Net: Accurate Depth Super-Resolution , 2016, ECCV.

[17]  Tong Tong,et al.  Image Super-Resolution Using Dense Skip Connections , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[18]  Horst Bischof,et al.  A Deep Primal-Dual Network for Guided Depth Super-Resolution , 2016, BMVC.

[19]  Michael J. Black,et al.  A Naturalistic Open Source Movie for Optical Flow Evaluation , 2012, ECCV.

[20]  Jian Yang,et al.  Image Super-Resolution via Deep Recursive Residual Network , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  David A. Forsyth,et al.  Sparse depth super resolution , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Thomas S. Huang,et al.  Image Super-Resolution Via Sparse Representation , 2010, IEEE Transactions on Image Processing.

[23]  Ling Shao,et al.  Simultaneous Super-Resolution and Cross-Modality Synthesis of 3D Medical Images Using Weakly-Supervised Joint Convolutional Sparse Coding , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Heiko Hirschmüller,et al.  Evaluation of Cost Functions for Stereo Matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Richard Szeliski,et al.  A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms , 2001, International Journal of Computer Vision.

[26]  Hongdong Li,et al.  Iteratively reweighted graph cut for multi-label MRFs with non-convex priors , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Yoshimitsu Aoki,et al.  Depth image enhancement using local tangent plane approximations , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Xiaoou Tang,et al.  Accelerating the Super-Resolution Convolutional Neural Network , 2016, ECCV.

[29]  Ali Borji,et al.  Salient Object Detection: A Benchmark , 2015, IEEE Transactions on Image Processing.

[30]  Jean Ponce,et al.  Learning a convolutional neural network for non-uniform motion blur removal , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Bernhard Schölkopf,et al.  EnhanceNet: Single Image Super-Resolution Through Automated Texture Synthesis , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[32]  Ce Liu,et al.  Deep Convolutional Neural Network for Image Deconvolution , 2014, NIPS.

[33]  Horst Bischof,et al.  Fast and accurate image upscaling with super-resolution forests , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Rogério Schmidt Feris,et al.  Edge-Guided Single Depth Image Super Resolution , 2016, IEEE Transactions on Image Processing.

[35]  Kyoung Mu Lee,et al.  Accurate Image Super-Resolution Using Very Deep Convolutional Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Gabriel J. Brostow,et al.  Patch Based Synthesis for Single Depth Image Super-Resolution , 2012, ECCV.

[37]  Derek Hoiem,et al.  Indoor Segmentation and Support Inference from RGBD Images , 2012, ECCV.

[38]  Aline Roumy,et al.  Low-Complexity Single-Image Super-Resolution based on Nonnegative Neighbor Embedding , 2012, BMVC.

[39]  Horst Bischof,et al.  Image Guided Depth Upsampling Using Anisotropic Total Generalized Variation , 2013, 2013 IEEE International Conference on Computer Vision.

[40]  Fukui Kazuhiro,et al.  Realistic CG Stereo Image Dataset With Ground Truth Disparity Maps , 2012 .

[41]  Christian Ledig,et al.  Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Daniel Rueckert,et al.  Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Christopher Joseph Pal,et al.  Learning Conditional Random Fields for Stereo , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[44]  Luc Van Gool,et al.  Anchored Neighborhood Regression for Fast Example-Based Super-Resolution , 2013, 2013 IEEE International Conference on Computer Vision.

[45]  Nathan Silberman,et al.  Indoor scene segmentation using a structured light sensor , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[46]  Carsten Rother,et al.  Depth Super Resolution by Rigid Body Self-Similarity in 3D , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[47]  Bo Li,et al.  Monocular Depth Estimation with Hierarchical Fusion of Dilated CNNs and Soft-Weighted-Sum Inference , 2017, Pattern Recognit..

[48]  Xiaoou Tang,et al.  Learning a Deep Convolutional Network for Image Super-Resolution , 2014, ECCV.

[49]  D. Scharstein,et al.  A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms , 2001, Proceedings IEEE Workshop on Stereo and Multi-Baseline Vision (SMBV 2001).

[50]  Kyoung Mu Lee,et al.  Deeply-Recursive Convolutional Network for Image Super-Resolution , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[51]  Kun Li,et al.  Depth Recovery Using an Adaptive Color-Guided Auto-Regressive Model , 2012, ECCV.

[52]  Michael S. Brown,et al.  High quality depth map upsampling for 3D-TOF cameras , 2011, 2011 International Conference on Computer Vision.

[53]  Xiaoou Tang,et al.  Depth Map Super-Resolution by Deep Multi-Scale Guidance , 2016, ECCV.

[54]  Enhong Chen,et al.  Image Denoising and Inpainting with Deep Neural Networks , 2012, NIPS.

[55]  Ting-Zhu Huang,et al.  Single-Image Super-Resolution via an Iterative Reproducing Kernel Hilbert Space Method , 2016, IEEE Transactions on Circuits and Systems for Video Technology.

[56]  Chunhua Shen,et al.  Depth and surface normal estimation from monocular images using regression on deep features and hierarchical CRFs , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[57]  Michael Elad,et al.  On Single Image Scale-Up Using Sparse-Representations , 2010, Curves and Surfaces.