Featureless Motion Vector-Based Simultaneous Localization, Planar Surface Extraction, and Moving Obstacle Tracking

Motion vectors (MVs) characterize the movement of pixel blocks in video streams and are readily available. MVs not only allow us to avoid expensive feature transform and correspondence computations but also provide motion information for both the environment and moving obstacles. This enables a new framework capable of simultaneous localization, scene mapping, and moving obstacle tracking. The method first extracts planes from MVs and their corresponding pixel macroblocks (MBs) using properties of plane-induced homographies. We then classify MBs as stationary or moving using geometric constraints on MVs, and label each plane as part of the stationary scene or of a moving obstacle by MB voting. The labeled planes then serve as observations for extended Kalman filters (EKFs) for both the stationary scene and the moving objects. We have implemented the proposed method, and the results show that it can establish plane-based rectilinear scene structure and detect moving objects while achieving localization accuracy similar to that of 1-Point EKF. More specifically, the system detects moving obstacles at a true positive rate of 96.6% with a relative absolute trajectory error of no more than 2.53%.
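As a rough illustration of the plane-extraction and voting steps, the Python sketch below fits a homography to MB centers and their MVs and then labels a plane by majority vote over its MBs. The function names, the normalized direct linear transform (DLT) estimator, the convention that an MV is the displacement of an MB center from one frame to the next, and the 0.5 vote threshold are all assumptions made for illustration; the abstract does not specify how the homographies are estimated or how the votes are tallied.

```python
import numpy as np

def _normalize(pts):
    """Hartley normalization: zero centroid, mean distance sqrt(2)."""
    c = pts.mean(axis=0)
    s = np.sqrt(2.0) / np.mean(np.linalg.norm(pts - c, axis=1))
    T = np.array([[s, 0.0, -s * c[0]],
                  [0.0, s, -s * c[1]],
                  [0.0, 0.0, 1.0]])
    return (pts - c) * s, T

def project(H, pts):
    """Apply homography H to an (N, 2) array of pixel coordinates."""
    ph = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return ph[:, :2] / ph[:, 2:3]

def fit_homography_from_mvs(centers, mvs):
    """Estimate H with centers + mvs ~ H(centers), via normalized DLT.

    Assumes at least four MBs in general position and H[2, 2] != 0.
    """
    centers = np.asarray(centers, dtype=float)
    dst = centers + np.asarray(mvs, dtype=float)
    src_n, T_src = _normalize(centers)
    dst_n, T_dst = _normalize(dst)
    rows = []
    for (x, y), (u, v) in zip(src_n, dst_n):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The null vector of the stacked constraints is the last right singular vector.
    _, _, vt = np.linalg.svd(np.array(rows))
    H = np.linalg.inv(T_dst) @ vt[-1].reshape(3, 3) @ T_src
    return H / H[2, 2]

def transfer_error(H, centers, mvs):
    """Per-MB distance (pixels) between the observed and H-predicted position."""
    centers = np.asarray(centers, dtype=float)
    observed = centers + np.asarray(mvs, dtype=float)
    return np.linalg.norm(project(H, centers) - observed, axis=1)

def label_plane_by_voting(moving_flags, threshold=0.5):
    """Label a plane 'moving' if more than `threshold` of its MBs are moving.

    Expects a nonempty sequence of per-MB booleans.
    """
    flags = np.asarray(moving_flags, dtype=bool)
    return "moving" if flags.mean() > threshold else "stationary"
```

In such a sketch, MBs whose transfer error under a fitted homography is small would be grouped into the same plane, and the per-MB stationary or moving flags would come from a separate geometric test on the MVs that is not shown here.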
