Paper information - Data-driven depth map refinement via multi-scale sparse representation

Depth maps captured by consumer-level depth cameras such as Kinect are usually degraded by noise, missing values, and quantization. In this paper, we present a data-driven approach for refining degraded RAW depth maps that are coupled with an RGB image. The key idea of our approach is to take advantage of a training set of high-quality depth data and transfer its information to the RAW depth map through multi-scale dictionary learning. Using a sparse representation, our method learns a dictionary of geometric primitives that captures the correlation between high-quality mesh data, RAW depth maps, and RGB images. The dictionary is learned and applied in a manner that accounts for various practical issues that arise in dictionary-based depth refinement. Compared to previous approaches that only utilize the correlation between RAW depth maps and RGB images, our method produces improved depth maps without over-smoothing. Since our approach is data-driven, the refinement can be targeted to a specific class of objects by employing a corresponding training set. In our experiments, we show that this leads to additional improvements in recovering depth maps of human faces.
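The general idea of transferring information through a coupled dictionary can be illustrated with a minimal single-scale sketch. This is not the authors' implementation: the abstract does not specify the solver, the patch layout, or how the scales are combined, so everything below is an assumption made for illustration. The sketch assumes each dictionary atom has an "observed" part (RAW depth and RGB patch values stacked into one vector, `D_obs`) and a matching "high-quality" part (`D_hq`). A sparse code is computed from the observed part with a plain greedy orthogonal matching pursuit, and the same code is then decoded with the high-quality part. The names `omp`, `refine_patch`, `D_obs`, and `D_hq` are hypothetical.

```python
import numpy as np


def omp(D, y, k):
    """Greedy orthogonal matching pursuit: approximate y with at most k atoms of D."""
    residual = y.copy()
    support = []
    sol = np.zeros(0)
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        # Select the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break  # no new atom helps; stop early
        support.append(j)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef


def refine_patch(D_obs, D_hq, observed, k=3):
    """Code the observed (RAW depth + RGB) patch, then decode with the high-quality atoms.

    D_obs:    (n_obs, n_atoms) observed part of each atom (assumed layout)
    D_hq:     (n_hq, n_atoms)  high-quality depth part of each atom (assumed layout)
    observed: (n_obs,)         stacked RAW depth and RGB patch values
    """
    # Normalize the observed atoms so atom selection is not biased by atom scale.
    norms = np.linalg.norm(D_obs, axis=0)
    norms[norms == 0] = 1.0
    alpha = omp(D_obs / norms, observed, k) / norms
    # The shared sparse code carries the information to the high-quality domain.
    return D_hq @ alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy dictionary: orthonormal observed atoms keep this check deterministic.
    D_obs, _ = np.linalg.qr(rng.standard_normal((32, 16)))
    D_hq = rng.standard_normal((25, 16))
    alpha_true = np.zeros(16)
    alpha_true[[2, 7, 11]] = [1.5, -2.0, 0.7]
    refined = refine_patch(D_obs, D_hq, D_obs @ alpha_true, k=3)
    print("max abs error:", np.max(np.abs(refined - D_hq @ alpha_true)))
```

In a real pipeline the dictionary would be learned from training data rather than drawn at random, patches would overlap and be averaged, and the procedure would be repeated across scales; none of those steps are shown here.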
