Hybrid visual and inertial position and orientation estimation based on known urban 3D models

More and more pedestrians carry devices, such as smartphones, that integrate a wide array of low-cost sensors (camera, IMU, magnetometer and GNSS receiver). GNSS is commonly used for pedestrian localization in urban environments, but its position estimates suffer from errors of several meters. To achieve more accurate localization and to improve pedestrian navigation and urban mobility, we present a method for city-scale localization with a handheld device. Our central idea is to estimate the 3D position and 3D orientation of the phone camera from known street furniture, which is highly repeatable and widely distributed across the city. First, using the inertial measurements acquired by the IMU within the vision-based method speeds up the computation of position and orientation. Second, a weighted fusion of the rotation matrices computed by the vision and inertial processes gives more importance to the estimate with the highest confidence. With a good selection of points, the resulting localization is within the precision of the post-processed GNSS measurements used to determine the position and orientation of the street furniture. Performance is reported in terms of positioning accuracy. The final aim is a precision good enough to support, in future work, on-site display in augmented reality.
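The weighted fusion of the vision and inertial rotation matrices can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme, so the code below is an assumption: it blends the two matrices with normalized confidence weights and projects the result back onto a valid rotation with an SVD (a chordal mean). The function names `fuse_rotations` and `rot_z` and the scalar confidence values are hypothetical and chosen only for illustration.

```python
import numpy as np


def rot_z(angle_rad):
    """Rotation matrix for a rotation of angle_rad about the z axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])


def fuse_rotations(r_vision, r_inertial, conf_vision, conf_inertial):
    """Blend two 3x3 rotation matrices by confidence (hypothetical scheme).

    The weighted sum of two rotation matrices is generally not a rotation,
    so it is projected onto the nearest rotation matrix with an SVD.
    """
    total = conf_vision + conf_inertial
    if total <= 0.0:
        raise ValueError("at least one confidence must be positive")
    w_v = conf_vision / total
    w_i = conf_inertial / total

    blended = w_v * np.asarray(r_vision) + w_i * np.asarray(r_inertial)

    # Nearest rotation in the Frobenius sense: U * diag(1, 1, d) * V^T,
    # where d = +/-1 forces a determinant of +1 (no reflection).
    u, _, vt = np.linalg.svd(blended)
    d = 1.0 if np.linalg.det(u @ vt) >= 0.0 else -1.0
    return u @ np.diag([1.0, 1.0, d]) @ vt


# Example: vision and inertial estimates disagree slightly on heading,
# and the vision estimate is trusted three times as much.
r_vis = rot_z(0.10)
r_imu = rot_z(0.20)
r_fused = fuse_rotations(r_vis, r_imu, conf_vision=3.0, conf_inertial=1.0)
heading = np.arctan2(r_fused[1, 0], r_fused[0, 0])
print(f"fused heading: {heading:.4f} rad")
```

The fused heading falls between the two inputs and closer to the estimate with the higher confidence. This sketch does not handle the degenerate case where the blended matrix is singular, for example two rotations 180 degrees apart with equal weights.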
