First-person indoor navigation via vision-inertial data fusion

In this paper, we aim to enhance first-person indoor navigation and scene understanding by fusing inertial data collected from a smartphone carried by the user with visual information obtained through the phone's camera. We employ vanishing directions, together with the orthogonality constraints of man-made environments, in an expectation-maximization framework to estimate the person's orientation with respect to known indoor coordinates from video frames. This framework lets us incorporate prior information about the camera's rotation axis for better estimates, and also select candidate edge lines for estimating hallway depth and width from monocular video frames and for 3D modeling of the scene. Our proposed algorithm combines the vision-based orientation estimates with the inertial data using a Kalman filter, refining the estimates and removing the substantial measurement drift of the inertial sensors. We evaluated our vision-inertial fusion method on an IMU-augmented video recorded in a rotary hallway in which a participant completed a full lap. The fusion provides virtually drift-free instantaneous estimates of the person's relative orientation; we were also able to estimate hallway depth and width and to generate a closed-path map of the rotary hallway over a roughly 60-meter lap.
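To make the orthogonality constraint concrete: a 3D line parallel to a Manhattan axis direction d projects to an image line whose interpretation plane (the plane through the camera center and the line segment) has a normal n with n·d = 0. A minimal sketch of the E-step built on this fact is given below; it is not the authors' implementation, and the function names, the noise scale `sigma`, and the fixed outlier likelihood are illustrative assumptions.

```python
import numpy as np

def line_normals(segments, K):
    """Interpretation-plane normals of 2D line segments.
    segments: (N, 4) array of pixel endpoints (x1, y1, x2, y2);
    K: 3x3 camera intrinsics matrix."""
    Kinv = np.linalg.inv(K)
    ones = np.ones(len(segments))
    p1 = (Kinv @ np.column_stack([segments[:, 0], segments[:, 1], ones]).T).T
    p2 = (Kinv @ np.column_stack([segments[:, 2], segments[:, 3], ones]).T).T
    n = np.cross(p1, p2)  # normal of the plane through both viewing rays
    return n / np.linalg.norm(n, axis=1, keepdims=True)

def e_step(normals, R, sigma=0.05):
    """Soft-assign each segment to a Manhattan axis or an outlier class.
    Columns of R are the three orthogonal axis directions in camera
    coordinates; a segment fits axis d when its normal is ~orthogonal to d."""
    resid = normals @ R                            # (N, 3): n_i . d_k per axis
    lik = np.exp(-(resid ** 2) / (2 * sigma ** 2)) # small residual => high likelihood
    outlier = np.full((len(normals), 1), np.exp(-2.0))  # flat pseudo-likelihood
    lik = np.hstack([lik, outlier])
    return lik / lik.sum(axis=1, keepdims=True)    # per-segment responsibilities
```

In a full EM loop, an M-step would re-estimate the rotation R from these responsibilities; the segments assigned to each axis are also exactly the candidate edge lines used downstream for the depth, width, and 3D-modeling steps.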

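The drift-removal idea behind the Kalman fusion can likewise be sketched with a one-state yaw filter: the gyroscope dead-reckons the heading between frames (accumulating drift), and the vision-based orientation supplies an absolute correction whenever a frame estimate is available. This is a simplified illustration under our own assumptions, not the paper's filter; the state, noise parameters `q` and `r`, and array layout are hypothetical.

```python
import numpy as np

def fuse_yaw(gyro_rate, vision_yaw, dt, q=1e-5, r=1e-3):
    """Fuse gyro yaw rate (rad/s) with per-frame vision yaw (rad, NaN if
    unavailable) using a scalar Kalman filter. Returns the fused yaw track."""
    out = np.empty(len(gyro_rate))
    yaw, P = 0.0, 1.0                       # state and its variance
    for k in range(len(gyro_rate)):
        # Predict: integrate the gyro; uncertainty (drift) grows by q.
        yaw += gyro_rate[k] * dt
        P += q
        # Update: absolute vision measurement pulls the estimate back.
        if not np.isnan(vision_yaw[k]):
            innov = (vision_yaw[k] - yaw + np.pi) % (2 * np.pi) - np.pi  # wrap
            K = P / (P + r)                 # Kalman gain
            yaw += K * innov
            P *= (1.0 - K)
        out[k] = yaw
    return out

# Example: 10 s of samples at 100 Hz, with a vision estimate every 10th sample.
t = np.arange(0, 10, 0.01)
gyro = 0.1 + 0.002 * np.random.randn(len(t))      # biased, noisy gyro
vision = np.where(np.arange(len(t)) % 10 == 0, 0.1 * t, np.nan)
fused = fuse_yaw(gyro, vision, dt=0.01)
```

Because the vision measurements are anchored to the building's Manhattan frame, the filter's steady-state error stays bounded over the lap instead of growing with time as pure gyro integration would.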