Robot SLAM and navigation with multi-camera computer vision

In this thesis we focus on computer vision capabilities suitable for practical mass-market mobile robots, with an emphasis on techniques using rigs of multiple standard cameras rather than more specialised sensors. We analyse the state of the art of service robotics, and attempt to distil the vision capabilities which will be required of mobile robots over the mid- and long-term future to permit autonomous localisation, mapping and navigation while integrating with other task-based vision requirements. The first main novel contribution of the work is to consider how an ad-hoc multi-camera rig can be used as the basis for metric navigation competences such as feature-based Simultaneous Localisation and Mapping (SLAM). The key requirement for the use of such techniques with multiple cameras is accurate calibration of the locations of the cameras as mounted on the robot. This is a challenging problem, since we consider the general case where the cameras might be mounted all around the robot with arbitrary 3D locations and orientations, and may have fields of view which do not intersect. In the second main part of the thesis, we move away from the idea that all cameras should contribute in a uniform manner to a single consistent metric representation, inspired by recent work on SLAM systems which have demonstrated impressive performance by combining off-the-shelf or simple techniques, an approach which we generally categorise by the term ‘lightweight’. We develop a multi-camera mobile robot vision system which goes beyond pure localisation and SLAM to permit fully autonomous mapping and navigation within a cluttered room, requiring free-space mapping and obstacle-avoiding planning capabilities. In the last part of the work we investigate the trade-offs involved in defining a camera rig suitable for this type of vision system and perform some experiments on camera placement.
