Project AutoVision: Localization and 3D Scene Perception for an Autonomous Vehicle with a Multi-Camera System

Project AutoVision aims to develop localization and 3D scene perception capabilities for a self-driving vehicle. These capabilities will enable autonomous navigation in urban and rural environments, by day and by night, with cameras as the only exteroceptive sensors. The sensor suite employs multiple cameras to provide both 360-degree coverage and accurate multi-view stereo, and the use of low-cost cameras keeps the overall cost of the suite to a minimum. In addition, the project seeks to extend the operating envelope to GNSS-denied conditions, which are typical of environments with tall buildings, foliage, and tunnels. Emphasis is placed on leveraging multi-view geometry and deep learning to enable the vehicle to localize and perceive in 3D space. This paper presents an overview of the project and describes the sensor suite and current progress in calibration, localization, and perception.
