High-quality Kinect depth filtering for real-time 3D telepresence

3D telepresence is a next-generation multimedia application that offers remote users an immersive, natural video-conferencing environment with real-time 3D graphics. The Kinect sensor, a consumer-grade range camera, has facilitated the implementation of several recent 3D telepresence systems. However, conventional data filtering methods are insufficient to handle Kinect depth error because the error is quantized rather than purely randomly distributed. As a result, one often observes large, irregularly shaped patches of pixels that receive the same depth value from the Kinect. To enhance visual quality in 3D telepresence, we propose a novel depth filtering method for Kinect based on multi-scale, direction-aware support windows. In addition, we develop a GPU-based CUDA implementation that performs depth filtering in real time. Experimental results show that our method reconstructs hole-free surfaces that are smoother and less bumpy than those produced by existing methods such as bilateral filtering.
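For reference, the bilateral-filtering baseline named above can be sketched as follows. This is a minimal illustration and not the proposed multi-scale, direction-aware method. The function name, the parameter values, the millimetre units, and the convention that 0 marks a missing reading are all assumptions made for the example. It is written in plain Python for readability rather than speed.

```python
import math

def bilateral_depth_filter(depth, radius=2, sigma_s=2.0, sigma_r=30.0):
    """Baseline bilateral filter on a depth map given as a list of rows.

    Assumptions (not taken from the paper): depth values are in
    millimetres and 0 marks a pixel with no reading. Such pixels are
    skipped as neighbours and left unfilled in the output.
    """
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            center = depth[y][x]
            if center <= 0:
                continue  # this baseline does not fill holes
            total, norm = 0.0, 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < h and 0 <= nx < w):
                        continue  # neighbour falls outside the image
                    d = depth[ny][nx]
                    if d <= 0:
                        continue  # ignore missing readings
                    # Spatial weight: falls off with pixel distance.
                    ws = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma_s ** 2))
                    # Range weight: falls off with depth difference.
                    wr = math.exp(-((d - center) ** 2) / (2.0 * sigma_r ** 2))
                    total += ws * wr * d
                    norm += ws * wr
            out[y][x] = total / norm  # norm >= 1: the centre has weight 1
    return out

# Two quantized depth levels meeting at a vertical boundary.
quantized = [[1000.0] * 4 + [1010.0] * 4 for _ in range(8)]
smoothed = bilateral_depth_filter(quantized)
```

In this toy input, pixels farther than `radius` from the boundary see only identical depth values, so the filter leaves the interior of each patch flat. Only a narrow band around the boundary is blended, which illustrates why a fixed small window struggles with large patches of quantized depth.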
