Multi-Modal 3D Object Detection in Autonomous Driving: A Survey

In this survey, we first introduce the popular sensors used for self-driving, the properties of their data, and the corresponding object detection algorithms. Next, we discuss existing datasets that can be used to evaluate multi-modal 3D object detection algorithms. We then review multi-modal fusion-based 3D detection networks, taking a close look at their fusion stage, fusion input, and fusion granularity, and at how these design choices have evolved with time and technology. After the review, we discuss open challenges and possible solutions. We hope that this survey helps researchers become familiar with the field and begin investigations in multi-modal 3D object detection.
