Hindsight is 20/20: Leveraging Past Traversals to Aid 3D Perception

cars must detect vehicles, pedestrians, and other traffic participants accurately to operate safely. Small, far-away, or highly occluded objects are par-ticularly challenging because there is limited information in the LiDAR point clouds for detecting them. To address this challenge, we leverage valuable information from the past: in particular, data collected in past traversals of the same scene. We posit that these past data, which are typically discarded, provide rich contextual information for disambiguating the above-mentioned challenging cases. To this end, we propose a novel, end-to-end trainable H IND SIGHT framework to extract this contextual information from past traversals and store it in an easy-to-query data structure, which can then be leveraged to aid future 3D object detection of the same scene. We show that this framework is compatible with most modern 3D detection architectures and can substantially improve their average precision on multiple autonomous driving datasets, most notably by more than 300% on the challenging cases. Our code is available at

[1]  Jiquan Ngiam,et al.  3D-MAN: 3D Multi-frame Attention Network for Object Detection , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Xiaogang Wang,et al.  From Points to Parts: 3D Object Detection From Point Cloud With Part-Aware and Part-Aggregation Network , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Jin Sun,et al.  Hidden Footprints: Learning Contextual Walkability from 3D Human Trails , 2020, ECCV.

[4]  R. Urtasun,et al.  Learning Lane Graph Representations for Motion Forecasting , 2020, ECCV.

[5]  Thomas Funkhouser,et al.  An LSTM Approach to Temporal 3D Object Detection in LiDAR Point Clouds , 2020, ECCV.

[6]  R. Urtasun,et al.  PnPNet: End-to-End Perception and Prediction With Tracking in the Loop , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Yan Wang,et al.  Train in Germany, Test in the USA: Making 3D Object Detectors Generalize , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Yanan Sun,et al.  3DSSD: Point-Based 3D Single Stage Object Detector , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Xiaogang Wang,et al.  PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Alex H. Lang,et al.  PointPainting: Sequential Fusion for 3D Object Detection , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Tiberiu T. Cocias,et al.  A survey of deep learning techniques for autonomous driving , 2019, J. Field Robotics.

[12]  Yan Wang,et al.  Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving , 2019, ICLR.

[13]  Qiang Xu,et al.  nuScenes: A Multimodal Dataset for Autonomous Driving , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Andreas Geiger,et al.  Computer Vision for Autonomous Vehicles: Problems, Datasets and State-of-the-Art , 2017, Found. Trends Comput. Graph. Vis..

[15]  Yin Zhou,et al.  End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds , 2019, CoRL.

[16]  Silvio Savarese,et al.  4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Jiong Yang,et al.  PointPillars: Fast Encoders for Object Detection From Point Clouds , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Xiaogang Wang,et al.  PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Bin Yang,et al.  HDNET: Exploiting HD Maps for 3D Object Detection , 2018, CoRL.

[20]  Bo Li,et al.  SECOND: Sparsely Embedded Convolutional Detection , 2018, Sensors.

[21]  Bin Yang,et al.  Deep Continuous Fusion for Multi-sensor 3D Object Detection , 2018, ECCV.

[22]  Victor Talpaert,et al.  Real-time Dynamic Object Detection for Autonomous Driving using Prior 3D-Maps , 2018, ECCV Workshops.

[23]  Wei Zhan,et al.  Fusing Bird’s Eye View LIDAR Point Cloud and Front View Camera Image for 3D Object Detection , 2018, 2018 IEEE Intelligent Vehicles Symposium (IV).

[24]  Bin Yang,et al.  PIXOR: Real-time 3D Object Detection from Point Clouds , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[25]  Steven Lake Waslander,et al.  Joint 3D Proposal Generation and Object Detection from View Aggregation , 2017, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[26]  Leonidas J. Guibas,et al.  Frustum PointNets for 3D Object Detection from RGB-D Data , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[27]  Yin Zhou,et al.  VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[28]  Ingmar Posner,et al.  Driven to Distraction: Self-Supervised Distractor Learning for Robust Monocular Visual Odometry in Urban Environments , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[29]  Leonidas J. Guibas,et al.  PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space , 2017, NIPS.

[30]  Alex Kendall,et al.  What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? , 2017, NIPS.

[31]  Leonidas J. Guibas,et al.  PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Bo Li,et al.  3D fully convolutional network for vehicle detection in point cloud , 2016, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[33]  Ji Wan,et al.  Multi-view 3D Object Detection Network for Autonomous Driving , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Dushyant Rao,et al.  Vote3Deep: Fast object detection in 3D point clouds using efficient convolutional neural networks , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[35]  Xiaolong Hu,et al.  Autonomous Driving in the iCity—HD Maps as a Key Challenge of the Automotive Industry , 2016 .

[36]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[38]  Alexei A. Efros,et al.  People Watching: Human Actions as a Cue for Single View Geometry , 2012, International Journal of Computer Vision.

[39]  Emilio Frazzoli,et al.  Synthetic 2D LIDAR for precise vehicle localization in 3D urban environment , 2013, 2013 IEEE International Conference on Robotics and Automation.

[40]  Alexei A. Efros,et al.  Scene Semantics from Long-Term Observation of People , 2012, ECCV.

[41]  Andreas Geiger,et al.  Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.