A High Quality Depth Map Upsampling Method Robust to Misalignment of Depth and Color Boundaries

In recent years, fusion camera systems that combine color cameras with Time-of-Flight (TOF) depth sensors have become popular because they sense depth at real-time frame rates. However, the captured depth maps have a much lower resolution than the corresponding color images, owing to the physical limitations of the TOF depth sensor. Most approaches to enhancing the resolution of captured depth maps depend on the implicit assumption that neighboring pixels with similar values in the color image are also similar in depth. Although many algorithms have been proposed, they still yield erroneous results, especially when region boundaries in the depth map and the color image are not aligned. We therefore propose a novel kernel regression framework to generate a high quality depth map. The proposed filter is based on the vector pointing toward similar pixels, a unit vector directed toward similar neighbors in the local region. These vectors are used to detect regions where color edges and depth edges are misaligned. Unlike conventional kernel regression methods, our method handles misaligned regions properly by introducing a numerical analysis of the local structure into the kernel regression framework. Experimental comparisons with other data fusion techniques demonstrate the superiority of the proposed algorithm.
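To make the color-similarity assumption concrete, the sketch below shows a plain joint bilateral upsampling filter, a conventional color-guided baseline and not the kernel regression method proposed here. The function name, the parameters (`scale`, `radius`, `sigma_s`, `sigma_c`), and the assumption that color values lie in [0, 1] are illustrative choices, not taken from the paper.

```python
import numpy as np

def joint_bilateral_upsample(depth_lr, color_hr, scale,
                             radius=2, sigma_s=1.0, sigma_c=0.1):
    """Upsample a low-res depth map guided by a high-res color image.

    depth_lr: (h, w) float array; color_hr: (h*scale, w*scale, 3) floats in [0, 1].
    Hypothetical baseline for illustration only.
    """
    H, W = color_hr.shape[:2]
    h, w = depth_lr.shape
    out = np.zeros((H, W), dtype=np.float64)
    for y in range(H):
        for x in range(W):
            # Position of the high-res pixel in low-res coordinates.
            cy, cx = y / scale, x / scale
            y0, x0 = int(round(cy)), int(round(cx))
            num = den = 0.0
            for qy in range(max(y0 - radius, 0), min(y0 + radius + 1, h)):
                for qx in range(max(x0 - radius, 0), min(x0 + radius + 1, w)):
                    # Spatial term: distance measured on the low-res grid.
                    ds = (qy - cy) ** 2 + (qx - cx) ** 2
                    # Range term: color difference between the target pixel
                    # and the high-res pixel where the depth sample sits.
                    gy, gx = min(qy * scale, H - 1), min(qx * scale, W - 1)
                    dc = np.sum((color_hr[y, x] - color_hr[gy, gx]) ** 2)
                    wgt = (np.exp(-ds / (2 * sigma_s ** 2))
                           * np.exp(-dc / (2 * sigma_c ** 2)))
                    num += wgt * depth_lr[qy, qx]
                    den += wgt
            out[y, x] = num / den
    return out
```

Because the range term trusts the color image, depth samples are averaged only with neighbors of similar color. When a color edge and the true depth edge coincide, this keeps the upsampled boundary sharp; when they are misaligned, the same weighting pulls depth values across the wrong boundary, which is the failure case the abstract describes.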
