Learning Kalman Network: A deep monocular visual odometry for on-road driving

Abstract This paper proposes a Learning Kalman Network (LKN) based monocular visual odometry (VO), i.e. LKN-VO, for on-road driving. Most existing learning-based VO focus on ego-motion estimation by comparing the two most recent consecutive frames. By contrast, the LKN-VO incorporates a learning ego-motion estimation through the current measurement, and a discriminative state estimator through a sequence of previous measurements. Superior to the model-based monocular VO, a more accurate absolute scale can be learned by LKN without any geometric constraints. In contrast to the model-based Kalman Filter (KF), the optimal model parameters of LKN can be obtained from dynamic and deterministic outputs of the neural network without elaborate human design. LKN is a hybrid approach where we achieve the non-linearity of the observation model and the transition model though deep neural networks, and update the state following the Kalman probabilistic mechanism. In contrast to the learning-based state estimator, a sparse representation is further proposed to learn the correlations within the states from the car’s movement behaviour, thereby applying better filtering on the 6DOF trajectory for on-road driving. The experimental results show that the proposed LKN-VO outperforms both model-based and learning state-estimator-based monocular VO on the most well-cited on-road driving datasets, i.e. KITTI and Apolloscape. In addition, LKN-VO is integrated with dense 3D mapping, which can be deployed for simultaneous localization and mapping in urban environments.

[1]  Tom Duckett,et al.  An adaptive appearance-based map for long-term topological localization of mobile robots , 2008, 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[2]  Sergey Levine,et al.  Backprop KF: Learning Discriminative Deterministic State Estimators , 2016, NIPS.

[3]  Xinyu Zhang,et al.  Semantic segmentation–aided visual odometry for urban autonomous driving , 2017 .

[4]  Paolo Valigi,et al.  Exploring Representation Learning With CNNs for Frame-to-Frame Ego-Motion Estimation , 2016, IEEE Robotics and Automation Letters.

[5]  Julius Ziegler,et al.  StereoScan: Dense 3d reconstruction in real-time , 2011, 2011 IEEE Intelligent Vehicles Symposium (IV).

[6]  John Folkesson,et al.  Geometric Correspondence Network for Camera Motion Estimation , 2018, IEEE Robotics and Automation Letters.

[7]  Wolfram Burgard,et al.  OctoMap: an efficient probabilistic 3D mapping framework based on octrees , 2013, Autonomous Robots.

[8]  Gabriele Costante,et al.  LS-VO: Learning Dense Optical Subspace for Robust Visual Odometry Estimation , 2017, IEEE Robotics and Automation Letters.

[9]  Li Sun,et al.  Learning Monocular Visual Odometry with Dense 3D Mapping from Dense 3D Flow , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[10]  Wolfram Burgard,et al.  VLocNet++: Deep Multitask Learning for Semantic Visual Localization and Odometry , 2018, IEEE Robotics and Automation Letters.

[11]  Sen Wang,et al.  DeepVO: Towards end-to-end visual odometry with deep Recurrent Convolutional Neural Networks , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[12]  Noah Snavely,et al.  Unsupervised Learning of Depth and Ego-Motion from Video , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Zhichao Yin,et al.  GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[14]  Jianqiang Wang,et al.  Vehicle Trajectory Prediction by Integrating Physics- and Maneuver-Based Approaches Using Interactive Multiple Models , 2018, IEEE Transactions on Industrial Electronics.

[15]  Li Sun,et al.  Dense RGB-D Semantic Mapping with Pixel-Voxel Neural Network , 2017, Sensors.

[16]  John J. Leonard,et al.  Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age , 2016, IEEE Transactions on Robotics.

[17]  Ji Zhang,et al.  LOAM: Lidar Odometry and Mapping in Real-time , 2014, Robotics: Science and Systems.

[18]  Thomas Brox,et al.  FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Li Sun,et al.  Recurrent-OctoMap: Learning State-Based Map Refinement for Long-Term Semantic Mapping With 3-D-Lidar Data , 2018, IEEE Robotics and Automation Letters.

[20]  Sen Wang,et al.  End-to-end, sequence-to-sequence probabilistic visual odometry through deep neural networks , 2018, Int. J. Robotics Res..

[21]  Dongbing Gu,et al.  UnDeepVO: Monocular Visual Odometry Through Unsupervised Deep Learning , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[22]  Jianqiang Wang,et al.  Object Classification Using CNN-Based Fusion of Vision and LIDAR in Autonomous Vehicle Environment , 2018, IEEE Transactions on Industrial Informatics.

[23]  Wolfram Burgard,et al.  Deep Auxiliary Learning for Visual Localization and Odometry , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[24]  Tom Duckett,et al.  A Minimalistic Approach to Appearance-Based Visual SLAM , 2008, IEEE Transactions on Robotics.

[25]  Wei Pan,et al.  Building a grid-semantic map for the navigation of service robots through human–robot interaction , 2015 .

[26]  Thomas Brox,et al.  DeMoN: Depth and Motion Network for Learning Monocular Stereo , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Li Sun,et al.  A fully end-to-end deep learning approach for real-time simultaneous 3D reconstruction and material recognition , 2017, 2017 18th International Conference on Advanced Robotics (ICAR).

[28]  Daniel Cremers,et al.  Direct Sparse Odometry , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Dongbing Gu,et al.  Building a grid-point cloud-semantic map based on graph for the navigation of intelligent wheelchair , 2015, 2015 21st International Conference on Automation and Computing (ICAC).