Stereo+Kinect for High Resolution Stereo Correspondences

In this work, we combine two complementary depth sensing modalities, the Kinect and stereo image matching, to obtain high quality correspondences. Our goal is to produce a dense disparity map at the spatial and depth resolution of the stereo cameras (4-12 MP). We propose a global optimization scheme in which both the data and smoothness costs are derived from sensor confidences and the low resolution geometry provided by the Kinect. A spatially varying search range limits the number of candidate disparities at each pixel. The smoothness prior is based on the available low resolution Kinect depth rather than on image gradients, and therefore performs better both in textured areas with smooth depth and in texture-less areas with depth gradients. We also propose a spatially varying smoothness weight that better handles occlusion areas and balances the relative contribution of the two energy terms. We demonstrate how the two sensors can be effectively fused to recover correct scene depth in ambiguous areas as well as fine structural details in textured areas. A sketch of the energy formulation follows below.
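To make the formulation concrete, the following is a minimal sketch of the kind of pairwise energy described above; the notation (the data term D_p, the smoothness term V_pq, the weight lambda_pq, the Kinect-derived disparity \tilde{d}_p, and the per-pixel search range S_p) is illustrative and not necessarily the paper's exact formulation:

    E(d) = \sum_{p} D_p(d_p) + \sum_{(p,q)\in\mathcal{N}} \lambda_{pq}\, V_{pq}(d_p, d_q), \qquad d_p \in S_p

    V_{pq}(d_p, d_q) = \min\!\bigl( \,\bigl| (d_p - d_q) - (\tilde{d}_p - \tilde{d}_q) \bigr|,\ \tau \bigr)

Under this reading, D_p would combine the stereo matching cost with a Kinect consistency term, each weighted by the respective sensor confidence; V_pq penalizes deviations of the estimated disparity gradient from the low resolution Kinect depth gradient rather than from a fronto-parallel (zero-gradient) assumption; and the spatially varying weight lambda_pq is reduced where the Kinect depth is unreliable, such as near occlusions.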
