CTIN: Robust Contextual Transformer Network for Inertial Navigation

Recently, data-driven inertial navigation approaches have demonstrated that well-trained neural networks can obtain accurate position estimates from inertial measurement unit (IMU) measurements. In this paper, we propose a novel, robust Contextual Transformer-based network for Inertial Navigation (CTIN) to accurately predict velocity and trajectory. To this end, we first design a ResNet-based encoder enhanced by local and global multi-head self-attention to capture spatial contextual information from IMU measurements. We then fuse these spatial representations with temporal knowledge by leveraging multi-head attention in the Transformer decoder. Finally, multi-task learning with uncertainty reduction is leveraged to improve learning efficiency and the prediction accuracy of velocity and trajectory. Extensive experiments over a wide range of inertial datasets (e.g., RIDI, OxIOD, RoNIN, IDOL, and our own) show that CTIN is robust and outperforms state-of-the-art models.
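The multi-task learning step can be illustrated with a minimal sketch. The abstract does not give the loss itself, so the code below assumes a common homoscedastic uncertainty weighting in which each task loss L_i is scaled by a learned log-variance s_i. The function name, the 0.5 factors, and the example loss values are illustrative assumptions, not the paper's implementation.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses with learned log-variances (assumed form).

    total = sum_i 0.5 * (exp(-s_i) * L_i + s_i), where s_i = log(sigma_i^2).
    A larger s_i down-weights task i, while the additive s_i term penalizes
    declaring every task arbitrarily uncertain.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("need exactly one log-variance per task loss")
    return sum(
        0.5 * (math.exp(-s) * loss + s)
        for loss, s in zip(task_losses, log_vars)
    )


# Hypothetical values: a velocity loss of 2.0 and a trajectory loss of 4.0.
equal_weighting = uncertainty_weighted_loss([2.0, 4.0], [0.0, 0.0])
# Raising the trajectory log-variance reduces that task's influence.
down_weighted = uncertainty_weighted_loss([2.0, 4.0], [0.0, math.log(2.0)])
```

In a training setup the log-variances would be trainable parameters optimized jointly with the network weights; here they are plain floats to keep the sketch self-contained.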
