Gated2Depth: Real-Time Dense Lidar From Gated Images

We present an imaging framework that converts three images from a gated camera into high-resolution depth maps with depth accuracy comparable to pulsed lidar measurements. Existing scanning lidar systems achieve low spatial resolution at large ranges due to mechanically limited angular sampling rates, restricting scene understanding tasks to close-range clusters with dense sampling. Moreover, today's pulsed lidar scanners suffer from high cost, high power consumption, and large form factors, and they fail in the presence of strong backscatter. We depart from point scanning and demonstrate that it is possible to turn a low-cost CMOS gated imager into a dense depth camera with a range of at least 80 m by learning depth from three gated images. The proposed architecture exploits semantic context across gated slices and is trained with a synthetic discriminator loss, without the need for dense depth labels. The proposed replacement for scanning lidar systems runs in real time, handles backscatter, and provides dense depth at long ranges. We validate our approach in simulation and on real-world data acquired over 4,000 km of driving in northern Europe. Data and code are available at https://github.com/gruberto/Gated2Depth.
