DeepToF: off-the-shelf real-time correction of multipath interference in time-of-flight imaging

Time-of-flight (ToF) imaging has become a widespread technique for depth estimation, allowing affordable off-the-shelf cameras to provide depth maps in real time. However, multipath interference (MPI) resulting from indirect illumination significantly degrades the captured depth. Most previous works have tried to solve this problem by means of complex hardware modifications or costly computations. In this work we avoid these approaches and propose a new technique that corrects MPI-induced depth errors, requires no camera modifications, and takes just 10 milliseconds per frame. By observing that most MPI information can be expressed as a function of the captured depth, we pose MPI removal as a convolutional problem and model it using a convolutional neural network. In particular, given that the input and output data present similar structure, we base our network on an autoencoder, which we train in two stages: first, we train the encoder (convolution filters) to learn a suitable basis to represent corrupted range images; then, we train the decoder (deconvolution filters) to correct depth from the learned basis, using synthetically generated scenes. This approach allows us to tackle the lack of reference data: we use a large-scale captured training set with corrupted depth to train the encoder, and a smaller synthetic training set with ground-truth depth, generated with physically based, time-resolved rendering, to train the corrector stage of the network. We demonstrate and validate our method on both synthetic and real complex scenarios, using an off-the-shelf ToF camera and only the captured incorrect depth as input.
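To make the effect of MPI on depth concrete, the following minimal sketch adds one indirect return to the direct return at a single pixel and recovers depth from the phase of the sum. It assumes a standard amplitude-modulated continuous-wave (AMCW) phasor model, which the text above does not specify; the modulation frequency, amplitudes, path lengths, and the helper names `phasor` and `depth_from_phasor` are illustrative assumptions, not values or code from this work.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def phasor(amplitude, path_length_m, freq_hz):
    """Complex phasor of a return whose total travelled path is path_length_m."""
    phase = 2.0 * math.pi * freq_hz * path_length_m / C
    return amplitude * cmath.exp(1j * phase)


def depth_from_phasor(p, freq_hz):
    """Depth implied by the phase of a measured phasor (round trip halved)."""
    phase = cmath.phase(p) % (2.0 * math.pi)
    return C * phase / (4.0 * math.pi * freq_hz)


# Illustrative (assumed) values: 20 MHz modulation, surface at 2 m.
FREQ_HZ = 20e6
TRUE_DEPTH_M = 2.0

# Direct return: light travels to the surface and back.
direct = phasor(1.0, 2.0 * TRUE_DEPTH_M, FREQ_HZ)

# One indirect return: weaker, and its path is 1.5 m longer (assumed bounce).
indirect = phasor(0.3, 2.0 * TRUE_DEPTH_M + 1.5, FREQ_HZ)

# The sensor only observes the sum, so the recovered depth is biased.
depth_direct_only = depth_from_phasor(direct, FREQ_HZ)
depth_with_mpi = depth_from_phasor(direct + indirect, FREQ_HZ)

print(f"depth without MPI: {depth_direct_only:.3f} m")
print(f"depth with MPI:    {depth_with_mpi:.3f} m")
```

In this toy setting the indirect path is longer than the direct one, so the recovered depth is larger than the true depth. Real scenes superpose many such indirect returns per pixel, which is the error the proposed network is trained to correct.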
