Body-relative Navigation Guidance using Uncalibrated Cameras
by Olivier Koch

The ability to navigate through the world is an essential capability for humans. In a variety of situations, people do not have the time, the opportunity or the capability to learn the layout of an environment before visiting it. Examples include soldiers in the field entering an unknown building, firefighters responding to an emergency, or a visually impaired person walking through the city. In the absence of an external source of localization (such as GPS), a navigation system must rely on internal sensing to provide guidance to the user. To address real-world situations, the method must provide spatially extended, temporally consistent navigation guidance through cluttered and dynamic environments. While recent research has largely focused on metric methods based on calibrated cameras, the work presented in this thesis demonstrates a novel approach to navigation using uncalibrated cameras. During the first visit to the environment, the method builds a topological representation of the user’s exploration path, which we refer to as the place graph. The method then provides navigation guidance from any place to any other in the explored environment. It does so with two components: a localization algorithm determines the location of the user in the graph, and a rotation guidance algorithm provides a directional cue towards the next graph node in the user’s body frame. Our method makes few assumptions about the environment other than that it contains descriptive visual features. It requires no intrinsic or extrinsic camera calibration, and relies instead on a method that learns the correlation between user rotation and feature correspondence across cameras. We validate our approach using several ground truth datasets. In addition, we show that our approach is capable of guiding a robot equipped with a local obstacle avoidance capability through real, cluttered environments. Finally, we validate our system with nine untrained users through several kilometers of indoor environments.

Thesis Supervisor: Seth Teller
Title: Professor of Computer Science and Engineering
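As an illustration only, the idea of a place graph built along the exploration path and then queried for a route between two places can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of the thesis: the class `PlaceGraph`, its methods, and the use of breadth-first search are all hypothetical, places are assumed to carry known identifiers, and neither the vision-based localization nor the rotation guidance algorithm is shown.

```python
from collections import deque


class PlaceGraph:
    """Hypothetical topological map: nodes are places, edges link
    places that were visited consecutively during exploration."""

    def __init__(self):
        self.neighbors = {}  # node id -> set of adjacent node ids
        self._last = None    # most recently visited node

    def add_place(self, node_id):
        """Record a visit, linking this place to the previous one.
        Revisiting an existing id closes a loop (an assumption here:
        recognizing a revisit is taken as given)."""
        self.neighbors.setdefault(node_id, set())
        if self._last is not None and self._last != node_id:
            self.neighbors[self._last].add(node_id)
            self.neighbors[node_id].add(self._last)
        self._last = node_id

    def shortest_path(self, start, goal):
        """Breadth-first search over the graph; returns a list of
        node ids from start to goal, or None if unreachable."""
        if start not in self.neighbors or goal not in self.neighbors:
            return None
        parent = {start: None}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if node == goal:
                path = []
                while node is not None:
                    path.append(node)
                    node = parent[node]
                return path[::-1]
            for nxt in sorted(self.neighbors[node]):
                if nxt not in parent:
                    parent[nxt] = node
                    queue.append(nxt)
        return None

    def next_node(self, current, goal):
        """Next place to head towards, or None if already there
        or if no route exists."""
        path = self.shortest_path(current, goal)
        if path is None or len(path) < 2:
            return None
        return path[1]


# Exploration path A -> B -> C -> D -> B (the last step revisits B).
graph = PlaceGraph()
for place in ["A", "B", "C", "D", "B"]:
    graph.add_place(place)

route = graph.shortest_path("D", "A")
```

In this sketch the revisit of B creates a shortcut, so the route from D to A goes through B directly rather than retracing the exploration path through C. In a full system, the node returned by `next_node` would be the target towards which a directional cue is computed.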
