Learning Dynamic Guidance for Depth Image Enhancement

The depth images acquired by consumer depth sensors (e.g., Kinect and ToF) usually are of low resolution and insufficient quality. One natural solution is to incorporate with high resolution RGB camera for exploiting their statistical correlation. However, most existing methods are intuitive and limited in characterizing the complex and dynamic dependency between intensity and depth images. To address these limitations, we propose a weighted analysis representation model for guided depth image enhancement, which advances the conventional methods in two aspects: (i) task driven learning and (ii) dynamic guidance. First, we generalize the analysis representation model by including a guided weight function for dependency modeling. And the task-driven learning formulation is introduced to obtain the optimized guidance tailored to specific enhancement task. Second, the depth image is gradually enhanced along with the iterations, and thus the guidance should also be dynamically adjusted to account for the updating of depth image. To this end, stage-wise parameters are learned for dynamic guidance. Experiments on guided depth image upsampling and noisy depth image restoration validate the effectiveness of our method.

[1]  Horst Bischof,et al.  Image Guided Depth Upsampling Using Anisotropic Total Generalized Variation , 2013, 2013 IEEE International Conference on Computer Vision.

[2]  Martin Kleinsteuber,et al.  A Joint Intensity and Depth Co-sparse Analysis Model for Depth Map Super-resolution , 2013, 2013 IEEE International Conference on Computer Vision.

[3]  Narendra Ahuja,et al.  Deep Joint Image Filtering , 2016, ECCV.

[4]  Michael S. Brown,et al.  High quality depth map upsampling for 3D-TOF cameras , 2011, 2011 International Conference on Computer Vision.

[5]  Sebastian Thrun,et al.  An Application of Markov Random Fields to Range Sensing , 2005, NIPS.

[6]  Lei Zhang,et al.  Weighted Nuclear Norm Minimization and Its Applications to Low Level Vision , 2016, International Journal of Computer Vision.

[7]  Jean Ponce,et al.  Task-Driven Dictionary Learning , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Andrew W. Fitzgibbon,et al.  KinectFusion: real-time 3D reconstruction and interaction using a moving depth camera , 2011, UIST.

[9]  Song-Chun Zhu Filters, Random Fields and Maximum Entropy (FRAME): Towards a Unified Theory for Texture Modeling , 1998 .

[10]  Yann LeCun,et al.  Learning Fast Approximations of Sparse Coding , 2010, ICML.

[11]  Feng Liu,et al.  Depth Enhancement via Low-Rank Matrix Completion , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  Michael Elad,et al.  Analysis versus synthesis in signal priors , 2006, 2006 14th European Signal Processing Conference.

[13]  Lei Zhang,et al.  Weighted Nuclear Norm Minimization with Application to Image Denoising , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Ruigang Yang,et al.  Reliability Fusion of Time-of-Flight Depth and Stereo Geometry for High Quality Depth Maps , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Horst Bischof,et al.  A Deep Primal-Dual Network for Guided Depth Super-Resolution , 2016, BMVC.

[16]  Michael Elad,et al.  Analysis K-SVD: A Dictionary-Learning Algorithm for the Analysis Sparse Model , 2013, IEEE Transactions on Signal Processing.

[17]  Derek Hoiem,et al.  Indoor Segmentation and Support Inference from RGBD Images , 2012, ECCV.

[18]  Yunjin Chen,et al.  Insights Into Analysis Operator Learning: From Patch-Based Sparse Models to Higher Order MRFs , 2014, IEEE Transactions on Image Processing.

[19]  Ivana Tosic,et al.  Learning Joint Intensity-Depth Sparse Representations , 2012, IEEE Transactions on Image Processing.

[20]  Wei Yu,et al.  On learning optimized reaction diffusion processes for effective image restoration , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Emmanuel J. Candès,et al.  The curvelet transform for image denoising , 2001, Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205).

[22]  Michael J. Black,et al.  Fields of Experts , 2009, International Journal of Computer Vision.

[23]  Heiko Hirschmüller,et al.  Evaluation of Cost Functions for Stereo Matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Yao Wang,et al.  Color-Guided Depth Recovery From RGB-D Data Using an Adaptive Autoregressive Model , 2014, IEEE Transactions on Image Processing.

[25]  Sebastian Thrun,et al.  A Noise‐aware Filter for Real‐time Depth Upsampling , 2008 .

[26]  Lei Zhang,et al.  Discriminative learning of iteration-wise priors for blind deconvolution , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  L. Rudin,et al.  Nonlinear total variation based noise removal algorithms , 1992 .

[28]  Jorge Nocedal,et al.  On the limited memory BFGS method for large scale optimization , 1989, Math. Program..

[29]  Stefan Roth,et al.  Shrinkage Fields for Effective Image Restoration , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[30]  Dani Lischinski,et al.  Joint bilateral upsampling , 2007, SIGGRAPH 2007.

[31]  Ruigang Yang,et al.  Spatial-Depth Super Resolution for Range Images , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[32]  Jean Ponce,et al.  Robust image filtering using joint static and dynamic guidance , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Michael F. Cohen,et al.  Digital photography with flash and no-flash image pairs , 2004, ACM Trans. Graph..

[34]  Stephen Lin,et al.  Data-driven depth map refinement via multi-scale sparse representation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Jian Sun,et al.  Guided Image Filtering , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  Li Xu,et al.  Mutual-Structure for Joint Filtering , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[37]  Dani Lischinski,et al.  A Closed-Form Solution to Natural Image Matting , 2008 .