Local-linear-fitting-based matting for joint hole filling and depth upsampling of RGB-D images

Abstract. We propose an approach for jointly filling holes and upsampling depth information for RGB-D images captured with common acquisition systems, in which RGB color information is available at all pixel locations, whereas depth information is available only at a lower resolution and is entirely missing in small regions referred to as “holes.” Depth information completion is formulated as the minimization of an objective function composed of two additive terms. The first, a data fidelity term, penalizes disagreement with the observed low-resolution data. The second, a regularization term, penalizes weighted depth deviations from a local linear model in spatial coordinates, where the weights are experimentally determined to ensure consistency between the RGB color image and the estimated depth image. Analogous to techniques used for optimization formulations of image matting, the completed depth image is then obtained by solving a large sparse linear system of equations. We also propose a memory-efficient implementation of the method based on the conjugate gradient method. Visual evaluation of the results demonstrates that the method provides high-resolution depth maps that are consistent with the color images. Furthermore, the memory-efficient implementation significantly reduces memory requirements, allowing the upsampled, hole-filled depth maps for typical RGB-D images to be computed on standard workstation hardware. Quantitative comparisons demonstrate that the method offers an improvement in accuracy over current state-of-the-art techniques for depth information completion.
Importantly, the statistical analysis presented in this paper also reveals that prior evaluations of depth upsampling accuracy are potentially biased because they inappropriately used preprocessed, hole-filled data as “ground truth.” An implementation of the proposed algorithm can be accessed and executed through Code Ocean: https://codeocean.com/capsule/5103691/tree/v1.
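
To make the structure of the optimization concrete, the following is a minimal one-dimensional sketch, not the implementation released with the paper. It assumes a simplified stand-in for the regularizer: a weighted squared second difference, which is zero exactly when three neighboring samples lie on a line, with weights that shrink where a guide (color) signal has an edge. The function names, the Gaussian weight, and the parameters `lam` and `sigma` are all hypothetical choices made for illustration. The resulting linear system is solved with a matrix-free conjugate gradient loop, so the system matrix is never stored.

```python
import math


def edge_weights(guide, sigma):
    """One weight per interior sample; near zero where the guide has an edge.

    The Gaussian form is an assumption made for this sketch.
    """
    return [
        math.exp(-((guide[i + 1] - guide[i - 1]) ** 2) / (2.0 * sigma ** 2))
        for i in range(1, len(guide) - 1)
    ]


def apply_system(d, weights, observed, lam):
    """Return A d for A = D^T W D + lam * M without forming A.

    D takes second differences, W holds the weights, and M is the
    diagonal mask of observed samples.
    """
    out = [0.0] * len(d)
    for k, w in enumerate(weights):
        i = k + 1  # center of the three-sample window
        r = d[i - 1] - 2.0 * d[i] + d[i + 1]  # deviation from a local line
        out[i - 1] += w * r
        out[i] -= 2.0 * w * r
        out[i + 1] += w * r
    for i in observed:
        out[i] += lam * d[i]
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(apply_a, b, tol=1e-12, max_iter=1000):
    """Standard conjugate gradient for a symmetric positive definite system."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the zero initial guess
    p = list(r)
    rs = dot(r, r)
    b_norm = math.sqrt(dot(b, b)) or 1.0
    for _ in range(max_iter):
        if math.sqrt(rs) <= tol * b_norm:
            break
        ap = apply_a(p)
        alpha = rs / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


def complete_depth(guide, observed, lam=1.0, sigma=0.1):
    """Minimize sum_i w_i (d[i-1] - 2 d[i] + d[i+1])^2 + lam * sum_obs (d[i] - y[i])^2."""
    weights = edge_weights(guide, sigma)
    b = [0.0] * len(guide)
    for i, y in observed.items():
        b[i] = lam * y
    return conjugate_gradient(
        lambda d: apply_system(d, weights, observed, lam), b
    )


# Toy data: a piecewise-linear depth profile whose jump coincides with a guide edge.
n = 24
guide = [0.0 if i < 12 else 1.0 for i in range(n)]
truth = [1.0 + 0.05 * i if i < 12 else 3.0 - 0.02 * i for i in range(n)]
observed = {i: truth[i] for i in range(0, n, 4)}  # depth known at every 4th sample

estimate = complete_depth(guide, observed)
max_error = max(abs(e - t) for e, t in zip(estimate, truth))
```

In this toy setting the depth is exactly linear on each side of the guide edge, so the unobserved samples are expected to be recovered closely and the jump preserved. Because `apply_system` only accumulates window contributions, the solver keeps a handful of length-`n` vectors in memory, which illustrates in a small way why a conjugate gradient formulation can reduce memory requirements relative to forming and factoring the full sparse matrix.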
