Accurate real-time occlusion for mixed reality

Correctly handling occlusion between real and virtual objects is important for any mixed reality (MR) system. Existing methods have typically required known geometry of the real objects in the scene, either specified manually or reconstructed using a dense mapping algorithm, which limits the situations in which they can be applied. Modern RGBD cameras are cheap and widely available, but the depth information they provide is typically too noisy and incomplete to give good results when used directly. In this paper, a method is proposed which uses both the colour and depth information provided by an RGBD camera to improve occlusion. This method, Cost Volume Filtering Occlusion, is capable of running in real time, and can also handle occlusion of virtual objects by dynamic objects such as the user's hands. The method operates on individual RGBD frames as they arrive, so it can function immediately in unknown environments and respond appropriately to sudden changes. The accuracy of the presented method is quantified using a novel approach that allows the results of frame-based algorithms such as this one to be compared with those of dense SLAM-based approaches. The proposed approach is shown to be capable of producing results superior to those of both previous image-based approaches and dense RGBD reconstruction, at lower computational cost.
