Augmented Reality Driving Using Semantic Geo-Registration

We propose a new approach that utilizes semantic information to register 2D monocular video frames to the world using 3D georeferenced data, for augmented reality driving applications. The geo-registration process uses our predicted vehicle pose to generate a rendered depth map for each frame, allowing 3D graphics to be convincingly blended with the real-world view. We also estimate absolute depth values for dynamic objects, up to 120 meters, based on the rendered depth map, and update the rendered depth map to reflect scene changes over time. This process also creates opportunistic global heading measurements, which are fused with other sensors to improve estimates of the 6 degrees-of-freedom global pose of the vehicle over state-of-the-art outdoor augmented reality systems [5]–, [19]. We evaluate the navigation accuracy and depth map quality of our system on a driving vehicle within various large-scale environments for producing realistic augmentations.

[1] Vincent Lepetit, et al. Instant Outdoor Localization and SLAM Initialization from 2.5D Maps, 2015, IEEE Transactions on Visualization and Computer Graphics.

[2] Steven Zhiying Zhou, et al. Positioning, tracking and mapping for outdoor augmentation, 2010, 2010 IEEE International Symposium on Mixed and Augmented Reality.

[3] Juan D. Tardós, et al. ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras, 2016, IEEE Transactions on Robotics.

[4] Xiaojuan Qi, et al. ICNet for Real-Time Semantic Segmentation on High-Resolution Images, 2017, ECCV.

[5] David Nistér, et al. Preemptive RANSAC for live structure and motion estimation, 2005, Machine Vision and Applications.

[6] Hongsheng Zhang, et al. Improved registration for vehicular AR using auto-harmonization, 2014, 2014 IEEE International Symposium on Mixed and Augmented Reality (ISMAR).

[7] J. M. M. Montiel, et al. ORB-SLAM: A Versatile and Accurate Monocular SLAM System, 2015, IEEE Transactions on Robotics.

[8] Roberto Cipolla, et al. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Robust Semantic Pixel-Wise Labelling, 2015, CVPR 2015.

[9] Marc Pollefeys, et al. Large Scale Visual Geo-Localization of Images in Mountainous Terrain, 2012, ECCV.

[10] Xiaogang Wang, et al. Convolutional neural networks with low-rank regularization, 2015, ICLR.

[11] Robert C. Bolles, et al. Parametric Correspondence and Chamfer Matching: Two New Techniques for Image Matching, 1977, IJCAI.

[12] Byoung-Jun Park, et al. Augmented reality for collision warning and path guide in a vehicle, 2015, VRST.

[13] Salah Sukkarieh, et al. Visual-Inertial-Aided Navigation for High-Dynamic Motion in Built Environments Without Initial Conditions, 2012, IEEE Transactions on Robotics.

[14] Christopher G. Harris, et al. A Combined Corner and Edge Detector, 1988, Alvey Vision Conference.

[15] Zhiwei Zhu, et al. Image to LIDAR matching for geotagging in urban environments, 2013, 2013 IEEE Workshop on Applications of Computer Vision (WACV).

[16] Supun Samarasekera, et al. Augmented reality binoculars on the move, 2014, International Symposium on Mixed and Augmented Reality.

[17] Daniel P. Huttenlocher, et al. Distance Transforms of Sampled Functions, 2012, Theory Comput.

[18] Huan Liu, et al. Class-specific grasping of 3D objects from a single 2D image, 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[19] Supun Samarasekera, et al. Multi-sensor navigation algorithm using monocular camera, IMU and GPS for large scale augmented reality, 2012, 2012 IEEE International Symposium on Mixed and Augmented Reality (ISMAR).

[20] Vincent Lepetit, et al. BRIEF: Binary Robust Independent Elementary Features, 2010, ECCV.

[21] Frank Dellaert, et al. Initialization techniques for 3D SLAM: A survey on rotation estimation and its use in pose graph optimization, 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).