Toward evaluation of visual navigation algorithms on RGB-D data from the first- and second-generation Kinect

Although the introduction of commercial RGB-D sensors has enabled significant progress in visual navigation methods for mobile robots, structured-light sensors such as the Microsoft Kinect and the Asus Xtion Pro Live have important limitations in range, field of view, and depth measurement accuracy. The recent introduction of the second-generation Kinect, which is based on the time-of-flight measurement principle, gave robotics and computer vision researchers a sensor that overcomes some of these limitations. However, because the new Kinect, like the older one, is intended for computer games and human motion capture rather than for navigation, it is unclear how much navigation methods such as visual odometry and SLAM can benefit from the improved parameters. While many RGB-D data sets are publicly available, only a few of them provide the ground truth information necessary for evaluating navigation methods, and to the best of our knowledge, none of them contains sequences recorded with the new version of the Kinect. Therefore, this paper describes a new RGB-D data set, which is a first attempt to systematically evaluate indoor navigation algorithms on data from two different sensors in the same environment and along the same trajectories. The data set contains synchronized RGB-D frames from both sensors and the corresponding ground truth from an external motion capture system based on distributed cameras. We describe the data registration procedure in detail and then evaluate our RGB-D visual odometry algorithm on the obtained sequences, investigating how the specific properties and limitations of both sensors influence the performance of this navigation method.
