Depth Super-Resolution via Deep Controllable Slicing Network

Owing to the imaging limitations of depth sensors, high-resolution (HR) depth maps are often difficult to acquire directly, so effective depth super-resolution (DSR) algorithms are needed to generate an HR output from its low-resolution (LR) counterpart. Previous methods treat all depth regions equally, without considering that the extent of degradation differs at the region level, and regard DSR at different scales as independent tasks, without jointly modeling the different scales. Both limitations impede further performance improvement and the practical use of DSR. To alleviate these problems, we propose a deep controllable slicing network from a novel perspective. Specifically, our model learns a set of slicing branches in a divide-and-conquer manner, parameterized by a distance-aware weighting scheme that adaptively aggregates different depths in an ensemble. Each branch specializes in one depth slice (e.g., the region within some depth range) and tends to yield accurate depth recovery for that slice. Meanwhile, a scale-controllable module that extracts depth features at different scales is inserted in front of the slicing network and enables fine-grained control of the slicing network's depth restoration results through a scale hyper-parameter. Extensive experiments on synthetic and real-world benchmark datasets demonstrate that our method achieves superior performance.
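The idea of aggregating per-slice branches with distance-aware weights can be illustrated with a small NumPy sketch. This is not the network described above, only a toy under stated assumptions: the function names, the slice centers, the softmax over negative squared distance, and the `temperature` parameter are all hypothetical choices made for illustration, and the abstract does not specify the actual weighting formula.

```python
import numpy as np

def distance_aware_weights(coarse_depth, centers, temperature=0.05):
    """Hypothetical weighting: softmax over negative squared distance
    between each pixel's coarse depth and each slice center.

    coarse_depth: (H, W) array of normalized depths
    centers:      (K,) array, one assumed center per depth slice
    returns:      (K, H, W) weights that sum to 1 over K at every pixel
    """
    d2 = (coarse_depth[None, :, :] - centers[:, None, None]) ** 2
    logits = -d2 / temperature
    logits -= logits.max(axis=0, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=0, keepdims=True)

def aggregate(branch_outputs, weights):
    """Pixel-wise weighted sum of the K branch predictions, each (H, W)."""
    return (weights * branch_outputs).sum(axis=0)

# Toy example: two slices (near and far) and a 1x2 depth map.
centers = np.array([0.25, 0.75])
coarse = np.array([[0.2, 0.8]])
# Stand-ins for branch predictions: each branch outputs a constant map.
branch_outputs = np.stack([np.full((1, 2), 0.2), np.full((1, 2), 0.8)])

weights = distance_aware_weights(coarse, centers)
fused = aggregate(branch_outputs, weights)
```

In this toy, the near pixel is dominated by the first branch and the far pixel by the second, which mirrors the divide-and-conquer intuition that each branch is most trusted inside its own depth range.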
