AN AUTOMATIC KEY-FRAME SELECTION METHOD FOR VISUAL ODOMETRY BASED ON THE IMPROVED PWC-NET

Abstract. To respond quickly to rapid motion of mobile platforms in complex situations such as sudden direction changes or camera shake, visual odometry/visual simultaneous localization and mapping (VO/VSLAM) typically requires a high-frame-rate vision sensor. However, a high sensor frame rate degrades the real-time performance of the odometry. Therefore, a balance must be struck between the sensor frame rate and the pose tracking quality. In this paper, we propose an automatic key-frame selection method for mobile platforms based on an improved PWC-Net, which improves the pose tracking quality of the odometry, reduces the error caused by motion blur, and increases global robustness. First, a two-step decomposition is used to compute the inter-frame attitude change; then, key-frames are either added by the improved PWC-Net or selected automatically according to the vehicle's motion state, which is predicted from the pose change over a short time interval. To evaluate the method, we conduct extensive experiments with monocular visual odometry on the KITTI dataset. The results indicate that our method maintains pose tracking quality while preserving real-time performance.
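The abstract only outlines the selection pipeline, so the following is a minimal, hypothetical sketch of how such a rule could look in Python. The two-step pose decomposition, the thresholds, and the `flow_fn` callable (standing in for a PWC-Net mean-flow query between two frames) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rotation_angle(R):
    """Rotation magnitude (rad) of a 3x3 rotation matrix."""
    return np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))

def decompose_relative_pose(T_a, T_b):
    """Split the relative motion between two 4x4 poses into
    a rotation angle and a translation norm (two-step decomposition)."""
    T_rel = np.linalg.inv(T_a) @ T_b
    return rotation_angle(T_rel[:3, :3]), np.linalg.norm(T_rel[:3, 3])

def select_keyframes(poses, flow_fn,
                     rot_thresh=0.05, trans_thresh=0.3, flow_thresh=20.0):
    """Hypothetical key-frame selection: keep the first frame, then add a
    key-frame whenever the short-interval pose change (rotation or
    translation) or the mean optical-flow magnitude returned by `flow_fn`
    exceeds a threshold. All thresholds are placeholder values."""
    keyframes = [0]
    for i in range(1, len(poses)):
        d_rot, d_trans = decompose_relative_pose(poses[keyframes[-1]], poses[i])
        if d_rot > rot_thresh or d_trans > trans_thresh:
            keyframes.append(i)      # fast ego-motion: keep this frame
        elif flow_fn(keyframes[-1], i) > flow_thresh:
            keyframes.append(i)      # large apparent image motion: keep it too
    return keyframes
```

In use, `poses` would be the short-interval pose estimates from the front-end and `flow_fn` a wrapper that runs the (improved) PWC-Net on the two frame indices and returns the mean flow magnitude; both are assumed interfaces for this sketch.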
