Paper Information - Gated2Depth: Real-Time Dense Lidar From Gated Images

We present an imaging framework that converts three images from a gated camera into high-resolution depth maps with depth accuracy comparable to pulsed lidar measurements. Existing scanning lidar systems achieve only low spatial resolution at large ranges because their angular sampling rates are mechanically limited, which restricts scene understanding tasks to close-range clusters with dense sampling. Moreover, today's pulsed lidar scanners suffer from high cost, high power consumption, and large form factors, and they fail in the presence of strong backscatter. We depart from point scanning and demonstrate that it is possible to turn a low-cost CMOS gated imager into a dense depth camera with a range of at least 80 m by learning depth from three gated images. The proposed architecture exploits semantic context across gated slices and is trained on a synthetic discriminator loss without the need for dense depth labels. The proposed replacement for scanning lidar systems runs in real time, handles backscatter, and provides dense depth at long ranges. We validate our approach in simulation and on real-world data acquired over 4,000 km of driving in northern Europe. Data and code are available at https://github.com/gruberto/Gated2Depth.
